Systems and methods to diagnose sarcoidosis and identify markers of the condition

ABSTRACT

Systems and methods to diagnose sarcoidosis are described. In addition to diagnosing sarcoidosis, the systems and methods can distinguish sarcoidosis from tuberculosis. Further disclosed is a cDNA library and methods of its use for reliably identifying sarcoidosis markers.

CROSS REFERENCE TO RELATED APPLICATION

This application is a continuation of co-pending U.S. patent application Ser. No. 15/555,419, filed Sep. 1, 2017, which is the U.S. National Phase of PCT/US2016/021035, filed Mar. 4, 2016, which claims the benefit of the earlier filing date of U.S. Provisional Patent Application No. 62/128,436, filed on Mar. 4, 2015. Each of these earlier applications is incorporated herein by reference in its entirety as if fully set forth herein.

STATEMENT OF GOVERNMENT INTEREST

This invention was made with government support under grant HL104481 awarded by the National Institutes of Health. The government has certain rights in the invention.

FIELD OF THE DISCLOSURE

The current disclosure provides systems and methods to diagnose sarcoidosis. In addition to diagnosing sarcoidosis, the systems and methods can distinguish sarcoidosis from tuberculosis. Further disclosed is a cDNA library and methods of its use for reliably identifying sarcoidosis markers.

SEQUENCE LISTING

A computer readable text file, entitled “2DE7582.txt (SequenceListing.txt)” created on or about Oct. 10, 2020, with a file size of 152 KB, contains the sequence listing for this application and is hereby incorporated by reference in its entirety.

BACKGROUND OF THE DISCLOSURE

Sarcoidosis, also called sarcoid, is a disease involving abnormal collections of inflammatory cells (granulomas) that can form as nodules in multiple organs. The granulomas are most often located in the lungs or their associated lymph nodes. The disease seems to be caused by an immune reaction to an infection or some other trigger.

Diagnosis of sarcoidosis is challenging as the signs and symptoms of the condition are very broad, sometimes mimicking symptoms of other diseases. Further, symptoms can vary widely according to the organ system affected by the disorder. This variance can lead to a delay in diagnosis, or inappropriate treatment, therefore demonstrating a need for improved sarcoidosis diagnostic techniques.

The symptoms of sarcoidosis can also particularly resemble those caused by infection with tuberculosis. Thus, the ability of a diagnostic to reliably distinguish between sarcoidosis and tuberculosis infection would allow faster treatment of each condition, resulting in better treatment outcomes.

SUMMARY OF THE DISCLOSURE

The present disclosure provides systems and methods to diagnose sarcoidosis in a subject. The systems and methods can distinguish a sarcoidosis subject from a healthy subject and/or a subject having tuberculosis. The systems and methods include diagnostic kits. The systems and methods also include a cDNA library to identify markers for sarcoidosis or tuberculosis diagnosis as well as methods of using the cDNA library to identify such markers, among others.

BRIEF DESCRIPTION OF THE FIGURES

FIG. 1 depicts a schematic diagram of identification of sarcoidosis antigens. A process combining phage-display technology, protein microarray and bioinformatics tools was used to select a panel of novel clones for the diagnosis of sarcoidosis. A cDNA library was constructed from a pool of total RNA isolated from 20 bronchoalveolar cell (BAL) samples and 36 white blood cell (WBC) samples from sarcoid patients, and then combined with RNA extracts from cultured human monocytes and human embryonic fibroblasts. After digestion, the cDNA library was inserted into the T7 phage vector and packaged into T7 phages to generate a sarcoid cDNA-phage-display library. Several rounds of biopanning of the library were performed with pooled control sera for negative selection, and with sarcoid sera for positive enrichment. After four rounds of biopanning, enriched sarcoid-specific peptide clones were cultured onto LB agar plates. A total of 1152 single colonies, including positive and negative clones, were randomly picked and propagated into 96-well plates. Phage-clone lysates were then printed robotically onto coated glass slides to create a sarcoid-phage-protein microarray. Cy5 (red fluorescent dye)-labeled antihuman antibody was used to detect IgGs in human serum that were reactive to peptide clones, and a Cy3 (green fluorescent dye)-labeled antibody was used to detect the phage capsid protein in order to normalize for spotting. Thus, if a phage clone carried a peptide reactive to human IgG, the spot shows the red Cy5 signal; otherwise the spot remains green, suggesting an unreactive clone. A total of 115 sarcoid sera, 64 healthy control sera and 17 TB sera were tested on the 1152 phage peptide microarray. Bioinformatically analyzed data identified 259 antigens with the highest level of differentiation between sarcoidosis and healthy controls.

FIG. 2A shows a heatmap generated by applying meta-analysis using microarray analysis of 2 separate data sets derived from 115 sarcoidosis patients and 64 healthy controls. Data reflects 259 antigens expressed significantly differently between healthy controls and sarcoidosis subjects in immunoscreening using sera. The 259 antigens were further divided into three categories according to the AW-OC method. I: 78 antigens were consistently up- or down-regulated in sarcoidosis in both datasets; II: 115 antigens were up- or down-regulated in sarcoidosis in the second dataset only; III: 66 antigens were up- or down-regulated in sarcoidosis in the first dataset only. FIG. 2B shows receiver operating characteristics (ROC) curves demonstrating the performance of 32 classifiers to discriminate between healthy controls and sarcoidosis subjects.

FIGS. 3A-3J show receiver operating characteristics (ROC) curves for the top 10 sarcoidosis clones as follows: 3A (CCL21); 3B (Metap1); 3C (PC4); 3D (CLI_3190); 3E (TNFRSF21); 3F (CD14); 3G (DNAJC1); 3H (APBB1); 3I (FGFBP-2); and 3J (SH3YL1).

FIG. 4 shows a Venn diagram depicting differential phage clone significances among sarcoidosis, TB and healthy controls (q<0.01). The Venn diagram shows the overlap between 259 sarcoidosis clones and 238 TB clones as compared to healthy controls, as well as 380 TB clones versus sarcoidosis. Forty-seven clones could differentiate both sarcoidosis and TB from healthy controls. Five clones could not discriminate between TB and sarcoidosis.

FIG. 5A shows a heatmap generated from a microarray analysis using 3 data sets derived from 115 sarcoidosis patients, 64 control subjects and 17 TB patients. Fifty antigens showed significant differential expression among the three groups. FIG. 5B shows an enlarged version of clone identifiers to increase legibility of FIG. 5A.

FIG. 6 shows modified linkers distinguishing between the origins of each library. Each cDNA library was tagged with a modified linker: EcoRI/HindIII was used for BAL cDNA, ALA for WBC cDNA, LEU for MARC5 cDNA and THR for EL1 cDNA.

FIG. 7A shows a graphical representation of the output eluent phage titers as a function of biopanning (BP) showing exponential enrichment of the output eluent phage titers after the completion of each cycle of biopannings. FIG. 7B shows PCR amplification of the phage clones picked up from biopannings 3 & 4 (BP3 & BP4) showing retention of diversity in the pool of immunoreactive phage.

FIG. 8 shows sequence analysis of the top 10 sarcoid phage clones using NCBI BLAST.

FIG. 9 shows sequence analysis of the top 10 TB phage clones using NCBI BLAST.

FIG. 10 shows an illustrative schematic for using computational tools as part of a process for diagnosing sarcoidosis, including an illustrative diagram of a computing device implementing the diagnostic framework.

FIG. 11 shows an illustrative process for diagnosing sarcoidosis.

DETAILED DESCRIPTION

Sarcoidosis is a multisystem granulomatous inflammatory disease. The disease is typically characterized by the formation of small, granular inflammatory lesions or granulomas (e.g., non-caseating granulomas) in a variety of organs, and/or the presence of immune responses (e.g., presence of CD4+ T lymphocytes and macrophages) in affected tissues or organs. Granulomatous inflammation may be attributed to the accumulation of monocytes, macrophages, and a pronounced Th1 response and activated T-lymphocytes, with elevated production of TNFα, IL-2, IL-12, IFNγ, IL-1, IL-6 or IL-15.

Exemplary subtypes of sarcoidosis include systemic sarcoidosis, Lofgren's syndrome, pulmonary sarcoidosis, cutaneous sarcoidosis, neurosarcoidosis, cardiac sarcoidosis, ocular sarcoidosis, hepatic sarcoidosis, musculoskeletal sarcoidosis, renal sarcoidosis, or sarcoidosis with the involvement of other organs or tissues.

Systemic sarcoidosis is sarcoidosis with multiple organ involvement. Symptoms of systemic sarcoidosis include aches, arthritis, chills, dry mouth, enlarged lymph glands (e.g., armpit lump), fatigue, fever, loss of appetite, night sweats, nosebleed, pains, persistent cough, malaise, shortness of breath, weakness, and weight loss. Because systemic sarcoidosis involves multiple organs, symptoms described below for other more particular types of sarcoidosis can also be relevant to systemic sarcoidosis.

Lofgren's syndrome represents an acute presentation of systemic sarcoidosis, typically characterized by the triad of erythema nodosum, bilateral hilar adenopathy and arthritis or arthralgias. It can also be accompanied by fever.

Pulmonary sarcoidosis refers to sarcoidosis that affects pulmonary tissues or organs (e.g., lungs). Symptoms of pulmonary sarcoidosis usually include normal, abnormal or deteriorating lung function; abnormal lung stiffness; bleeding from the lung tissue; cough; decreased lung volume; decreased vital capacity (full breath in, to full breath out); enlarged lymph nodes in the chest; granulomas in alveolar septa, bronchiolar, and/or bronchial walls; higher than normal expiratory flow ratios; an increased FEV₁/FVC ratio; limited amount of air drawn into the lungs; loss of lung volume; obstructive lung changes; pulmonary hypertension; pulmonary failure; scarring of lung tissue; and/or shortness of breath.

Cutaneous sarcoidosis is a complication of sarcoidosis with skin involvement. Cutaneous sarcoidosis includes annular sarcoidosis, erythrodermic sarcoidosis, hypopigmented sarcoidosis, ichthyosiform sarcoidosis, morpheaform sarcoidosis, mucosal sarcoidosis, papular sarcoid, scar sarcoid, subcutaneous sarcoidosis and ulcerative sarcoidosis. Symptoms of cutaneous sarcoidosis include erythema nodosum (e.g., raised, red, firm skin sores, cellulitis, furunculosis or other inflammatory panniculitis); hair loss; lupus pernio (e.g., scar or discoid lupus erythematosus); maculopapular eruptions; nodular lesions; papules (e.g., granulomatous rosacea, acne or benign appendageal tumors); skin lesions; skin plaques (e.g., psoriasis, lichen planus, nummular eczema, discoid lupus erythematosus, granuloma annulare, cutaneous T-cell lymphoma, Kaposi's sarcoma or secondary syphilis); skin rashes, and/or scars becoming more raised.

Neurosarcoidosis or neurosarcoid refers to sarcoidosis in which inflammation and abnormal deposits occur in the brain, spinal cord, and any other areas of the nervous system. Symptoms of neurosarcoidosis can include abnormal or loss of sense of smell; abnormal or loss of sense of taste; carpal tunnel syndrome; changes in menstrual periods; confusion; decreased hearing; delirium; dementia; disorientation; dizziness; double vision or other vision problems or changes; excessive thirst; excessive tiredness (e.g., fatigue); facial palsy, weakness or drooping; headache; high urine output; hypopituitarism; loss of bowel or bladder control; muscle weakness; paraplegia; psychiatric disturbances; radicular pain; retinopathy; seizures; sensory losses; speech impairment; and/or vertigo.

The systems and methods disclosed herein can be used to diagnose sarcoidosis. In particular embodiments, the diagnosed sarcoidosis is systemic sarcoidosis, pulmonary sarcoidosis, cutaneous sarcoidosis, Lofgren's syndrome, neurosarcoidosis, cardiac sarcoidosis, ocular sarcoidosis, hepatic sarcoidosis, musculoskeletal sarcoidosis, renal sarcoidosis, or sarcoidosis with the involvement of other organs or tissues. In more particular embodiments, the systems and methods disclosed herein can be used to diagnose pulmonary sarcoidosis, neurosarcoidosis, and/or ocular sarcoidosis.

Typically, a sarcoidosis patient will present with symptoms described above or clinical features set out in the Statement on Sarcoidosis published by the American Thoracic Society (Am. J. Respir. Crit. Care Med. 1999; 160(2):736-55). Sarcoidosis patients may often, however, be asymptomatic. Further, the common symptoms of sarcoidosis are vague, and can sometimes be similar to symptoms of numerous other conditions including lymphoma and tuberculosis. Thus, diagnosis is difficult.

Currently, subjects with suspected sarcoidosis are typically assessed with a chest assessment for pulmonary involvement, as the vast majority of sarcoidosis subjects have pulmonary involvement. These assessments are generally based upon a bronchoscopy with biopsy; chest X-ray; CT scan; CT-guided biopsy; lung gallium (Ga) scan; mediastinoscopy; open lung biopsy; PET scan and/or a radiograph. Radiographs are typically assigned a stage of 0-4 according to the presence or absence of hilar adenopathy and parenchymal disease. Thus there are five stages: Stage 0: no visible intrathoracic findings; Stage 1: bilateral hilar lymphadenopathy (BHL), which may be accompanied by paratracheal adenopathy, with lung fields clear of infiltrates; Stage 2: bilateral hilar adenopathy (BHL) accompanied by parenchymal infiltration; Stage 3: parenchymal infiltration without bilateral hilar adenopathy (BHL); or Stage 4: advanced pulmonary fibrosis with evidence of honey-combing, hilar retraction, bullae, cysts, and emphysema.
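The radiographic staging described above can be sketched as a small decision function. This is a minimal illustration only: the function name, its boolean parameters, and the precedence given to advanced fibrosis are assumptions made for the example and are not part of the disclosed systems or methods.

```python
def radiographic_stage(hilar_adenopathy: bool,
                       parenchymal_infiltration: bool,
                       advanced_fibrosis: bool = False) -> int:
    """Return a stage of 0-4 from three radiographic findings.

    Hypothetical helper: the inputs are assumed to have been read
    from a chest radiograph by a clinician.
    """
    if advanced_fibrosis:
        return 4  # advanced pulmonary fibrosis
    if hilar_adenopathy and parenchymal_infiltration:
        return 2  # BHL accompanied by parenchymal infiltration
    if hilar_adenopathy:
        return 1  # BHL with lung fields clear of infiltrates
    if parenchymal_infiltration:
        return 3  # parenchymal infiltration without BHL
    return 0      # no visible intrathoracic findings


if __name__ == "__main__":
    print(radiographic_stage(True, False))
```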

The present disclosure provides significant advancements in the diagnosis of sarcoidosis because diagnosis can be achieved with, for example, a blood test and can distinguish sarcoidosis subjects from healthy subjects and/or subjects having tuberculosis.

The systems and methods disclosed herein were achieved by creating and screening a complex cDNA library. Particularly, a heterologous cDNA library derived from bronchoalveolar cell (BAL) samples and total white blood cells (WBC) from sarcoidosis patients was developed. Both sarcoid-derived libraries were combined with cultured human monocytes and embryonic lung fibroblast cDNA libraries to build a complex sarcoidosis library (CSL). Differential biopanning for negative and positive selection was performed using sera from healthy controls to remove non-specific IgG, and sarcoidosis sera for selective enrichment. Four rounds of biopannings were performed and the selected phage libraries were used for microarray immunoscreening. Each cycle of biopanning included passing the entire phage library through protein G beads coated with IgG from pooled sera of healthy controls, then passing through beads coated with IgGs from individual serum of sarcoid subjects.

After biopanning, phage clones were randomly selected and amplified and their lysates were arrayed in quintuplicates onto slides (Grace Biolabs, OR) using the ProSys 5510TL robot (Cartesian Technologies, CA). It was tested whether this novel library representing relevant antigens would specifically recognize high IgG titer in sera of sarcoidosis subjects.

Using bioinformatics tools, a large number of markers with high sensitivity and specificity were identified that discriminate among the sera of patients with sarcoidosis, healthy controls and TB. Using the integrative-analysis method that combines results from two independent trials, clones that significantly differentiated sarcoidosis from controls were identified. Similarly, clones that differentially reacted with TB sera and not with sarcoidosis or control sera were identified. Furthermore, the top 10 discriminating antigens for TB and sarcoidosis were sequenced and homologies were identified in a public database. These data indicate development of a unique library enabling the detection of highly significant antigens to discriminate between patients with sarcoidosis and tuberculosis.

An antigen is a substance that induces an immune response. Accordingly, the antigens detected from the library are markers useful for diagnosing sarcoidosis and TB.

The systems and methods diagnose sarcoidosis by assaying a sample obtained from a subject for the up- or down-regulation of one or more markers associated with sarcoidosis. The markers are selected from Small inducible cytokine A21 precursor (CCL21); Methionine aminopeptidase 1 (Metap1); Activated RNA polymerase II transcription cofactor variant 4 (PC4); RNA methyltransferase (CLI_3190); Tumor necrosis factor receptor superfamily member 21 precursor (also known as death receptor 6 (DR6)) (TNFRSF21); Monocyte differentiation antigen CD14 (CD14); DnaJ (Hsp40) homolog subfamily C member 1 precursor (DNAJC1); Amyloid β A4 precursor protein-binding family B member 1-interacting protein (APBB1); Fibroblast growth factor binding protein 2 precursor (FGFBP-2); SH3 domain-containing YSC84 like protein 1 (SH3YL1); thioester reductase [Pseudomonas fluorescens] (PFWH6_0117); histidine kinase [Pseudomonas fluorescens] (PFL_3193); Homo sapiens chromatin modifying protein 4B (CHMP4B); hypothetical protein [Porphyromonas somerae] Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria; truncated HIC1 protein [Homo sapiens] (H1C1); replication protein [Mycobacterium] (MVAC_06252); Homo sapiens ribosomal protein S2 (RPS2); triosephosphate isomerase [Mycobacterium tuberculosis] (tpiA); membrane protein [Mycobacterium tuberculosis] (Rv2563); serine/threonine protein kinase [Mycobacterium tuberculosis] (Rv0410C); PPE family protein [Mycobacterium tuberculosis RGTB423] (MRGA423_16320); rRNA methyltransferase [Mycobacterium tuberculosis] (Rv0881); peroxisome biogenesis factor 10 isoform 1 [Homo sapiens] (PEX10); sulfate ABC transporter permease [Mycobacterium tuberculosis] (CysU); and/or D-alpha-D-heptose-7-phosphate kinase [Mycobacterium tuberculosis] (hddA).

In particular embodiments, the systems and methods diagnose sarcoidosis by assaying a sample obtained from a subject for the up- or down-regulation of two or more; three or more; four or more; five or more; six or more; seven or more; eight or more; nine or more or ten or more markers associated with sarcoidosis disclosed herein. In further embodiments, the systems and methods diagnose sarcoidosis by assaying a sample obtained from a subject for the up- or down-regulation of two; three; four; five; six; seven; eight; nine or ten markers associated with sarcoidosis disclosed herein.

In one embodiment, the markers include (hereafter referred to by gene abbreviations for brevity) CCL21; Metap1; PC4; CLI_3190; TNFRSF21; CD14; DNAJC1; APBB1; FGFBP-2; and SH3YL1. In another embodiment, the markers include CCL21; Metap1; PC4; CLI_3190; TNFRSF21; and APBB1. In another embodiment, the markers include CCL21; PC4; CLI_3190; DNAJC1; APBB1; FGFBP-2; and SH3YL1. In another embodiment, the markers include CCL21; Metap1; PC4; CLI_3190; TNFRSF21; CD14; DNAJC1; APBB1; FGFBP-2; or SH3YL1. In another embodiment, the markers include CCL21; Metap1; CLI_3190; APBB1; and SH3YL1.

In other embodiments, the markers include CCL21 in combination with two, three, four, five, six, seven, eight or nine markers selected from: Metap1; PC4; CLI_3190; TNFRSF21; CD14; DNAJC1; APBB1; FGFBP-2; and SH3YL1.

In other embodiments, the markers include Metap1 in combination with two, three, four, five, six, seven, eight or nine markers selected from: CCL21; PC4; CLI_3190; TNFRSF21; CD14; DNAJC1; APBB1; FGFBP-2; and SH3YL1.

In other embodiments, the markers include PC4 in combination with two, three, four, five, six, seven, eight or nine markers selected from: CCL21; Metap1; CLI_3190; TNFRSF21; CD14; DNAJC1; APBB1; FGFBP-2; and SH3YL1.

In other embodiments, the markers include CLI_3190 in combination with two, three, four, five, six, seven, eight or nine markers selected from: CCL21; Metap1; PC4; TNFRSF21; CD14; DNAJC1; APBB1; FGFBP-2; and SH3YL1.

In other embodiments, the markers include TNFRSF21 in combination with two, three, four, five, six, seven, eight or nine markers selected from: CCL21; Metap1; PC4; CLI_3190; CD14; DNAJC1; APBB1; FGFBP-2; and SH3YL1.

In other embodiments, the markers include CD14 in combination with two, three, four, five, six, seven, eight or nine markers selected from: CCL21; Metap1; PC4; CLI_3190; TNFRSF21; DNAJC1; APBB1; FGFBP-2; and SH3YL1.

In other embodiments, the markers include DNAJC1 in combination with two, three, four, five, six, seven, eight or nine markers selected from: CCL21; Metap1; PC4; CLI_3190; TNFRSF21; CD14; APBB1; FGFBP-2; and SH3YL1.

In other embodiments, the markers include APBB1 in combination with two, three, four, five, six, seven, eight or nine markers selected from: CCL21; Metap1; PC4; CLI_3190; TNFRSF21; CD14; DNAJC1; FGFBP-2; and SH3YL1.

In other embodiments, the markers include FGFBP-2 in combination with two, three, four, five, six, seven, eight or nine markers selected from: CCL21; Metap1; PC4; CLI_3190; TNFRSF21; CD14; DNAJC1; APBB1; and SH3YL1.

In other embodiments, the markers include SH3YL1 in combination with two, three, four, five, six, seven, eight or nine markers selected from: CCL21; Metap1; PC4; CLI_3190; TNFRSF21; CD14; DNAJC1; APBB1; and FGFBP-2.

In other embodiments, the markers exclude a marker selected from CCL21; Metap1; PC4; CLI_3190; TNFRSF21; CD14; DNAJC1; APBB1; FGFBP-2; and SH3YL1. In other embodiments, the markers exclude CCL21. In other embodiments, the markers exclude Metap1. In other embodiments, the markers exclude PC4. In other embodiments, the markers exclude CLI_3190. In other embodiments, the markers exclude TNFRSF21. In other embodiments, the markers exclude CD14. In other embodiments, the markers exclude DNAJC1. In other embodiments, the markers exclude APBB1. In other embodiments, the markers exclude FGFBP-2. In other embodiments, the markers exclude SH3YL1. In other embodiments, the markers exclude one or more of Metap1; TNFRSF21; and CD14. In other embodiments, the markers exclude Metap1; TNFRSF21; and CD14.

Any of the embodiments described above can additionally include a marker selected from PFWH6_0117; PFL_3193; CHMP4B; hypothetical protein [Porphyromonas somerae] Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria; H1C1; MVAC_06252; RPS2; tpiA; Rv2563; Rv0410C; MRGA423_16320; Rv0881; PEX10; CysU; and hddA. In particular embodiments, the additional marker includes PFWH6_0117. In particular embodiments, the additional marker includes PFL_3193. In particular embodiments, the additional marker includes CHMP4B. In particular embodiments, the additional marker includes hypothetical protein Peptidase family C39 mostly contains bacteriocin-processing endopeptidases from bacteria. In particular embodiments, the additional marker includes H1C1. In particular embodiments, the additional marker includes MVAC_06252. In particular embodiments, the additional marker includes RPS2. In particular embodiments, the additional marker includes tpiA. In particular embodiments, the additional marker includes Rv2563. In particular embodiments, the additional marker includes Rv0410C. In particular embodiments, the additional marker includes MRGA423_16320. In particular embodiments, the additional marker includes Rv0881. In particular embodiments, the additional marker includes PEX10. In particular embodiments, the additional marker includes CysU. In particular embodiments, the additional marker includes hddA.

The systems and methods disclosed herein also allow distinguishing sarcoidosis from tuberculosis in a subject by assaying a sample obtained from a subject for the up- or down-regulation of one or more markers that distinguish sarcoidosis from tuberculosis. The markers include: Ferredoxin (Mycobacterium tuberculosis) (Fed A); WDFY3 protein (Homo sapiens) (WDFY3); Membrane protein (Mycobacterium tuberculosis) (MFS); Leucine rich PPR-motif containing protein (Homo sapiens) (LRPPRC); HLA-DR alpha (Homo sapiens) (HLA-DR); Transketolase (Mycobacterium tuberculosis) (TKT); Dihydroxy acid dehydratase (Mycobacterium tuberculosis) (Rv0189C); Chain A Mycobacterium tuberculosis (BfrA); Disabled homolog 2 isoform 2 (Homo sapiens) (DAB2); and Transcription elongation factor B polypeptide 2 isoform (Homo sapiens) (TCEB2).

In particular embodiments, the systems and methods distinguish sarcoidosis from tuberculosis in a subject by assaying a sample obtained from a subject for the up- or down-regulation of two or more; three or more; four or more; five or more; six or more; seven or more; eight or more; nine or more or ten or more markers that distinguish sarcoidosis from tuberculosis disclosed herein. In further embodiments, the systems and methods distinguish sarcoidosis from tuberculosis by assaying a sample obtained from a subject for the up- or down-regulation of two; three; four; five; six; seven; eight; nine or ten markers associated with sarcoidosis disclosed herein.

In one embodiment, the markers include (hereafter referred to by gene abbreviations for brevity) Fed A; WDFY3; MFS; LRPPRC; HLA-DR; TKT; Rv0189C; BfrA; DAB2; and TCEB2. In another embodiment, the markers include HLA-DR; MFS; DAB2; BfrA; and WDFY3. In another embodiment, the markers include HLA-DR; MFS; DAB2; BfrA; or WDFY3. In another embodiment, the markers include HLA-DR; MFS; and DAB2. In another embodiment, the markers include HLA-DR; MFS; or DAB2.

In other embodiments, the markers include Fed A in combination with two, three, four, five, six, seven, eight or nine markers selected from: WDFY3; MFS; LRPPRC; HLA-DR; TKT; Rv0189C; BfrA; DAB2; and TCEB2.

In other embodiments, the markers include WDFY3 in combination with two, three, four, five, six, seven, eight or nine markers selected from: Fed A; MFS; LRPPRC; HLA-DR; TKT; Rv0189C; BfrA; DAB2; and TCEB2.

In other embodiments, the markers include MFS in combination with two, three, four, five, six, seven, eight or nine markers selected from: Fed A; WDFY3; LRPPRC; HLA-DR; TKT; Rv0189C; BfrA; DAB2; and TCEB2.

In other embodiments, the markers include LRPPRC in combination with two, three, four, five, six, seven, eight or nine markers selected from: Fed A; WDFY3; MFS; HLA-DR; TKT; Rv0189C; BfrA; DAB2; and TCEB2.

In other embodiments, the markers include HLA-DR in combination with two, three, four, five, six, seven, eight or nine markers selected from: Fed A; WDFY3; MFS; LRPPRC; TKT; Rv0189C; BfrA; DAB2; and TCEB2.

In other embodiments, the markers include TKT in combination with two, three, four, five, six, seven, eight or nine markers selected from: Fed A; WDFY3; MFS; LRPPRC; HLA-DR; Rv0189C; BfrA; DAB2; and TCEB2.

In other embodiments, the markers include Rv0189C in combination with two, three, four, five, six, seven, eight or nine markers selected from: Fed A; WDFY3; MFS; LRPPRC; HLA-DR; TKT; BfrA; DAB2; and TCEB2.

In other embodiments, the markers include BfrA in combination with two, three, four, five, six, seven, eight or nine markers selected from: Fed A; WDFY3; MFS; LRPPRC; HLA-DR; TKT; Rv0189C; DAB2; and TCEB2.

In other embodiments, the markers include DAB2 in combination with two, three, four, five, six, seven, eight or nine markers selected from: Fed A; WDFY3; MFS; LRPPRC; HLA-DR; TKT; Rv0189C; BfrA; and TCEB2.

In other embodiments, the markers include TCEB2 in combination with two, three, four, five, six, seven, eight or nine markers selected from: Fed A; WDFY3; MFS; LRPPRC; HLA-DR; TKT; Rv0189C; BfrA; and DAB2.

In other embodiments, the markers exclude a marker selected from Fed A; WDFY3; MFS; LRPPRC; HLA-DR; TKT; Rv0189C; BfrA; DAB2; and TCEB2. In other embodiments, the markers exclude Fed A. In other embodiments, the markers exclude WDFY3. In other embodiments, the markers exclude MFS. In other embodiments, the markers exclude LRPPRC. In other embodiments, the markers exclude HLA-DR. In other embodiments, the markers exclude TKT. In other embodiments, the markers exclude Rv0189C. In other embodiments, the markers exclude BfrA. In other embodiments, the markers exclude DAB2. In other embodiments, the markers exclude TCEB2.

“Up-regulation” or “up-regulated” means an increase in the presence of a protein and/or an increase in the expression of its gene. “Down-regulation” or “down-regulated” means a decrease in the presence of a protein and/or a decrease in the expression of its gene. “Its gene” in reference to a particular protein refers to a nucleic acid sequence (used interchangeably with polynucleotide or nucleotide sequence) that encodes the particular protein. This definition also includes various sequence polymorphisms, mutations, and/or sequence variants wherein such alterations do not substantially affect the identity or function of the particular protein. For example, in a sequence identity analysis, the test protein would share at least 80% sequence identity; at least 81% sequence identity; at least 82% sequence identity; at least 83% sequence identity; at least 84% sequence identity; at least 85% sequence identity; at least 86% sequence identity; at least 87% sequence identity; at least 88% sequence identity; at least 89% sequence identity; at least 90% sequence identity; at least 91% sequence identity; at least 92% sequence identity; at least 93% sequence identity; at least 94% sequence identity; at least 95% sequence identity; at least 96% sequence identity; at least 97% sequence identity; at least 98% sequence identity or at least 99% sequence identity with the particular protein.

“% sequence identity” refers to a relationship between two or more sequences, as determined by comparing the sequences. In the art, “identity” also means the degree of sequence relatedness between protein (or nucleic acid) sequences as determined by the match between strings of such sequences. “Identity” (often referred to as “similarity”) can be readily calculated by known methods, including those described in: Computational Molecular Biology (Lesk, A. M., ed.) Oxford University Press, NY (1988); Biocomputing: Informatics and Genome Projects (Smith, D. W., ed.) Academic Press, NY (1994); Computer Analysis of Sequence Data, Part I (Griffin, A. M., and Griffin, H. G., eds.) Humana Press, NJ (1994); Sequence Analysis in Molecular Biology (Von Heijne, G., ed.) Academic Press (1987); and Sequence Analysis Primer (Gribskov, M. and Devereux, J., eds.) Oxford University Press, NY (1992). Preferred methods to determine sequence identity are designed to give the best match between the sequences tested. Methods to determine sequence identity and similarity are codified in publicly available computer programs. Sequence alignments and percent identity calculations may be performed using the Megalign program of the LASERGENE bioinformatics computing suite (DNASTAR, Inc., Madison, Wis.). Multiple alignment of the sequences can also be performed using the Clustal method of alignment (Higgins and Sharp, CABIOS, 5, 151-153 (1989)) with default parameters (GAP PENALTY=10, GAP LENGTH PENALTY=10). Relevant programs also include the GCG suite of programs (Wisconsin Package Version 9.0, Genetics Computer Group (GCG), Madison, Wis.); BLASTP, BLASTN, BLASTX (Altschul, et al., J. Mol. Biol. 215:403-410 (1990)); DNASTAR (DNASTAR, Inc., Madison, Wis.); and the FASTA program incorporating the Smith-Waterman algorithm (Pearson, Comput. Methods Genome Res., [Proc. Int. Symp.] (1994), Meeting Date 1992, 111-20. Editor(s): Suhai, Sandor. Publisher: Plenum, New York, N.Y.). Within the context of this disclosure it will be understood that where sequence analysis software is used for analysis, the results of the analysis are based on the “default values” of the program referenced. “Default values” mean any set of values or parameters which originally load with the software when first initialized.
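As a rough illustration of a percent identity calculation, the sketch below scores two sequences that have already been aligned (for example, by one of the programs named above). It is a minimal example under stated assumptions: the function names, the use of "-" as the gap character, and the choice of alignment columns as the denominator are all hypothetical, and the programs cited in this disclosure may define the denominator differently.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns holding identical residues.

    Assumes both inputs are rows of the same pairwise alignment,
    so they must have equal length. Columns that are gaps in both
    rows are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no alignment columns to compare")
    return 100.0 * matches / columns


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 80.0) -> bool:
    """True if the pair shares at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Toy peptide strings, not sequences from the sequence listing.
    print(percent_identity("MKTA-IAKQR", "MKTAYIAKQK"))
```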

The function of a protein can be assayed by a relevant activity assay. Function is not substantially affected if there is no statistically significant difference in activity between the particular protein and the test protein. Exemplary activity assays include binding assays, or, if the protein is an enzyme, enzyme activity assays including, for example, protease assays, kinase assays, phosphatase assays, reductase assays, etc. Modulation of the kinetics of enzyme activities can be determined by measuring the Michaelis constant KM using known algorithms, such as the Hill plot, Michaelis-Menten equation, linear regression plots such as Lineweaver-Burk analysis, and Scatchard plot.
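As one hedged example of the kinetic analysis mentioned above, a Lineweaver-Burk analysis fits a straight line to 1/v versus 1/[S]; under the Michaelis-Menten equation v = Vmax[S]/(KM + [S]), the slope of that line is KM/Vmax and the intercept is 1/Vmax. The sketch below uses an ordinary least-squares fit; the function names and the toy data are assumptions for illustration only.

```python
from typing import Sequence, Tuple


def michaelis_menten(s: float, km: float, vmax: float) -> float:
    """Reaction velocity predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def lineweaver_burk_fit(substrate: Sequence[float],
                        velocity: Sequence[float]) -> Tuple[float, float]:
    """Estimate (KM, Vmax) by least squares on the double-reciprocal plot."""
    xs = [1.0 / s for s in substrate]   # 1/[S]
    ys = [1.0 / v for v in velocity]    # 1/v
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # KM / Vmax
    intercept = mean_y - slope * mean_x  # 1 / Vmax
    vmax = 1.0 / intercept
    km = slope * vmax
    return km, vmax


if __name__ == "__main__":
    # Noise-free toy data generated from assumed KM and Vmax values.
    conc = [0.5, 1.0, 2.0, 4.0, 8.0]
    rates = [michaelis_menten(s, km=2.0, vmax=10.0) for s in conc]
    print(lineweaver_burk_fit(conc, rates))
```

Because the double-reciprocal transform amplifies error at low substrate concentrations, a direct nonlinear fit is often preferred for noisy data; the linear form is shown here only because it is the analysis named in the text.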

The term “gene” can include not only coding sequences but also regulatory regions such as promoters, enhancers, and termination regions. The term further can include all introns and other DNA sequences spliced from the mRNA transcript, along with variants resulting from alternative splice sites. Gene sequences encoding the particular protein can be DNA or RNA that directs the expression of the particular protein. These nucleic acid sequences may be a DNA strand sequence that is transcribed into RNA or an RNA sequence that is translated into the particular protein. The nucleic acid sequences include both the full-length nucleic acid sequences as well as non-full-length sequences derived from the full-length protein. The sequences can also include degenerate codons of the native sequence. Portions of complete gene sequences are referenced throughout the disclosure as is understood by one of ordinary skill in the art.

Up- or down-regulation of the markers, as indicated elsewhere herein for particular markers, can be assessed by comparing a value to a relevant reference level. For example, the quantity of one or more markers can be indicated as a value. The value can be one or more numerical values resulting from the assaying of a sample, and can be derived, e.g., by measuring level(s) of the marker(s) in the sample by an assay performed in a laboratory, or from a dataset obtained from a provider such as a laboratory, or from a dataset stored on a server. The markers disclosed herein can be a protein marker or a nucleic acid marker (gene encoding the protein marker).

In the broadest sense, the value may be qualitative or quantitative. As such, where detection is qualitative, the systems and methods provide a reading or evaluation, e.g., assessment, of whether or not the marker is present in the sample being assayed. In yet other embodiments, the systems and methods provide a quantitative detection of whether the marker is present in the sample being assayed, i.e., an evaluation or assessment of the actual amount or relative abundance of the marker in the sample being assayed. In such embodiments, the quantitative detection may be absolute or, if the method is a method of detecting two or more different markers in a sample, relative. As such, the term “quantifying” when used in the context of quantifying a marker in a sample can refer to absolute or to relative quantification. Absolute quantification can be accomplished by inclusion of known concentration(s) of one or more control markers and referencing, e.g., normalizing, the detected level of the marker with the known control markers (e.g., through generation of a standard curve). Alternatively, relative quantification can be accomplished by comparison of detected levels or amounts between two or more different markers to provide a relative quantification of each of the two or more markers, e.g., relative to each other. The actual measurement of values of the markers can be determined at the protein or nucleic acid level using any method known in the art. In some embodiments, a marker is detected by contacting a sample with reagents (e.g., antibodies or nucleic acid primers), generating complexes of reagent and marker(s), and detecting the complexes.
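
As one hedged illustration of absolute quantification through a standard curve, the sketch below interpolates linearly between control standards of known concentration to convert a detected signal into a concentration. The function name, the units, and every numerical value are hypothetical; the disclosure does not specify how the standard curve is fitted, and piecewise-linear interpolation is simply one possible choice.

```python
def concentration_from_signal(standards, signal):
    """Interpolate a concentration from (known_concentration, measured_signal) pairs.

    Assumes the signal increases with concentration across the standards.
    """
    points = sorted(standards, key=lambda pair: pair[1])
    if not points[0][1] <= signal <= points[-1][1]:
        raise ValueError("signal lies outside the range of the standard curve")
    for (c0, s0), (c1, s1) in zip(points, points[1:]):
        if s0 <= signal <= s1:
            if s1 == s0:
                return c0
            # Linear interpolation between the two bracketing standards.
            return c0 + (c1 - c0) * (signal - s0) / (s1 - s0)
    raise ValueError("no bracketing standards found")


# Hypothetical standards: (concentration in ng/mL, optical density).
standards = [(0.0, 0.05), (10.0, 0.25), (50.0, 1.05), (100.0, 2.05)]
estimate = concentration_from_signal(standards, 0.65)
print(round(estimate, 3))
```

A signal outside the range of the standards raises an error in this sketch rather than being extrapolated, which is a design choice made here for caution.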

The reagent can include a probe. A probe is a molecule that binds a target, either directly or indirectly. The target can be a marker, a fragment of the marker, or any molecule that is to be detected. In embodiments, the probe includes a nucleic acid or a protein. As an example, a protein probe can be an antibody. An antibody can be a whole antibody or a fragment of an antibody. A probe can be labeled with a detectable label. Examples of detectable labels include fluorescers, chemiluminescers, dyes, enzymes, enzyme substrates, enzyme cofactors, enzyme inhibitors, enzyme subunits, metal ions, and radioactive isotopes.

“Protein” detection includes detection of full-length proteins, mature proteins, pre-proteins, polypeptides, isoforms, mutations, post-translationally modified proteins and variants thereof; such proteins can be detected in any suitable manner.

Those skilled in the art will be familiar with numerous specific immunoassay formats and variations thereof which can be useful for carrying out the methods disclosed herein. See, e.g., E. Maggio, Enzyme-Immunoassay (1980), CRC Press, Inc., Boca Raton, Fla.; and U.S. Pat. Nos. 4,727,022; 4,659,678; 4,376,110; 4,275,149; 4,233,402; and 4,230,797.

Antibodies can be conjugated to a solid support suitable for a diagnostic assay (e.g., beads such as protein A or protein G agarose, microspheres, plates, slides or wells formed from materials such as latex or polystyrene) in accordance with known techniques, such as passive binding. Antibodies can be conjugated to detectable labels or groups such as radiolabels (e.g., 35S, 125I, 131I), enzyme labels (e.g., horseradish peroxidase, alkaline phosphatase), and fluorescent labels (e.g., fluorescein, Alexa, green fluorescent protein, rhodamine) in accordance with known techniques.

Examples of suitable immunoassays include immunoblotting, immunoprecipitation, immunofluorescence, chemiluminescence, electro-chemiluminescence (ECL), and/or enzyme-linked immunosorbent assays (ELISA).

Antibodies may also be useful for detecting post-translational modifications of markers. Examples of post-translational modifications include tyrosine phosphorylation, threonine phosphorylation, serine phosphorylation, citrullination and glycosylation (e.g., O-GlcNAc). Such antibodies specifically detect the modified amino acids (e.g., phosphorylated amino acids) in marker proteins of interest. These antibodies are well-known to those skilled in the art, and commercially available. Post-translational modifications can also be determined using metastable ions in reflector matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF). See U. Wirth et al., Proteomics 2002, 2(10):1445-1451.

Up- or down-regulation of genes also can be detected using, for example, cDNA arrays, cDNA fragment fingerprinting, cDNA sequencing, clone hybridization, differential display, differential screening, FRET detection, liquid microarrays, PCR, RT-PCR, quantitative real-time RT-PCR analysis with TaqMan assays, molecular beacons, microelectric arrays, oligonucleotide arrays, polynucleotide arrays, serial analysis of gene expression (SAGE), and/or subtractive hybridization.

As an example, Northern hybridization analysis using probes which specifically recognize one or more marker sequences can be used to determine gene expression. Alternatively, expression can be measured using RT-PCR; e.g., polynucleotide primers specific for the differentially expressed marker mRNA sequences are used to reverse-transcribe the mRNA into DNA, which is then amplified in PCR and can be visualized and quantified. Marker RNA can also be quantified using, for example, other target amplification methods, such as transcription mediated amplification (TMA), strand displacement amplification (SDA), and nucleic acid sequence based amplification (NASBA), or signal amplification methods (e.g., bDNA), and the like. Ribonuclease protection assays can also be used, using probes that specifically recognize one or more marker mRNA sequences, to determine gene expression.

Further hybridization technologies that may be used are described in, for example, U.S. Pat. Nos. 5,143,854; 5,288,644; 5,324,633; 5,432,049; 5,470,710; 5,492,806; 5,503,980; 5,510,270; 5,525,464; 5,547,839; 5,580,732; 5,661,028; and 5,800,992 as well as WO 95/21265; WO 96/31622; WO 97/10365; WO 97/27317; EP 373 203; and EP 785 280.

Proteins and nucleic acids can be linked to chips, such as microarray chips. See, for example, U.S. Pat. Nos. 5,143,854; 6,087,112; 5,215,882; 5,707,807; 5,807,522; 5,958,342; 5,994,076; 6,004,755; 6,048,695; 6,060,240; 6,090,556; and 6,040,138. Microarray refers to a solid carrier or support that has a plurality of molecules bound to its surface at defined locations. The solid carrier or support can be made of any material. As an example, the material can be hard, such as metal, glass, plastic, silicon, ceramics, and textured and porous materials; or soft, such as gels, rubbers, polymers, and other non-rigid materials. The material can also be nylon membranes, epoxy-glass and borofluorate-glass. The solid carrier or support can be flat, but need not be and can include any type of shape such as spherical shapes (e.g., beads or microspheres). The solid carrier or support can have a flat surface as in slides and micro-titer plates having one or more wells.

Binding to proteins or nucleic acids on microarrays can be detected by scanning the microarray with a variety of laser or CCD-based scanners, and extracting features with software packages, for example, Imagene (Biodiscovery, Hawthorne, Calif.), Feature Extraction Software (Agilent), Scanalyze (Eisen, M. 1999. SCANALYZE User Manual; Stanford Univ., Stanford, Calif. Ver 2.32.), or GenePix (Axon Instruments).

Embodiments disclosed herein can be used with high throughput screening (HTS). Typically, HTS refers to a format that performs at least about 100 assays, at least about 500 assays, at least about 1000 assays, at least about 5000 assays, at least about 10,000 assays, or more per day. When enumerating assays, either the number of samples or the number of protein or nucleic acid markers assayed can be considered.

Generally, HTS methods involve a logical or physical array of either the subject samples, or the protein or nucleic acid markers, or both. Appropriate array formats include both liquid and solid phase arrays. For example, assays employing liquid phase arrays, e.g., for hybridization of nucleic acids, binding of antibodies or other receptors to ligand, etc., can be performed in multiwell or microtiter plates. Microtiter plates with 96, 384, or 1536 wells are widely available, and even higher numbers of wells, e.g., 3456 and 9600, can be used. In general, the choice of microtiter plates is determined by the methods and equipment, e.g., robotic handling and loading systems, used for sample preparation and analysis.

HTS assays and screening systems are commercially available from, for example, Zymark Corp. (Hopkinton, Mass.); Air Technical Industries (Mentor, Ohio); Beckman Instruments, Inc. (Fullerton, Calif.); Precision Systems, Inc. (Natick, Mass.), etc. These systems typically automate entire procedures including all sample and reagent pipetting, liquid dispensing, timed incubations, and final readings of the microplate in detector(s) appropriate for the assay. These configurable systems provide HTS as well as a high degree of flexibility and customization. The manufacturers of such systems provide detailed protocols for the various methods of HTS.

As stated previously, obtained marker values can be compared to a reference level. Reference levels can be obtained from one or more relevant datasets. A “dataset” as used herein is a set of numerical values resulting from evaluation of a sample (or population of samples) under a desired condition. The values of the dataset can be obtained, for example, by experimentally obtaining measures from a sample and constructing a dataset from these measurements. As is understood by one of ordinary skill in the art, the reference level can be based on, e.g., any mathematical or statistical formula useful and known in the art for arriving at a meaningful aggregate reference level from a collection of individual datapoints; e.g., mean, median, median of the mean, etc. Alternatively, a reference level or dataset to create a reference level can be obtained from a service provider such as a laboratory, or from a database or a server on which the dataset has been stored.
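
The following minimal sketch, offered only as an example, derives an aggregate reference level from a dataset using either the mean or the median and then labels a sample value relative to that level. The function names and all values are hypothetical. The sketch performs a raw comparison only and does not assess statistical significance.

```python
from statistics import mean, median


def reference_level(dataset, method="mean"):
    """Aggregate a collection of datapoints into a single reference level."""
    if method == "mean":
        return mean(dataset)
    if method == "median":
        return median(dataset)
    raise ValueError(f"unsupported method: {method}")


def regulation(sample_value, reference):
    """Label a sample value relative to the reference level."""
    if sample_value > reference:
        return "up"
    if sample_value < reference:
        return "down"
    return "unchanged"


# Hypothetical dataset containing one outlying value (5.0).
dataset = [1.0, 1.2, 0.9, 1.1, 5.0]
by_mean = reference_level(dataset, "mean")
by_median = reference_level(dataset, "median")
print(regulation(1.5, by_mean), regulation(1.5, by_median))
```

With these made-up values the same sample is labeled differently depending on the aggregate chosen, because the mean is pulled upward by the outlying value while the median is not.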

A reference level from a dataset can be derived from previous measures derived from a population. A “population” is any grouping of subjects or samples of like specified characteristics. The grouping could be according to, for example, clinical parameters, clinical assessments, therapeutic regimens, disease status, severity of condition, etc.

Subjects include humans, veterinary animals (dogs, cats, reptiles, birds, hamsters, etc.), livestock (horses, cattle, goats, pigs, chickens, etc.), research animals (monkeys, rats, mice, fish, etc.) and other animals, such as zoo animals (e.g., bears, giraffes, elephants, lemurs).

In particular embodiments, conclusions are drawn based on whether a sample value is statistically significantly different or not statistically significantly different from a reference level. A measure is not statistically significantly different if the difference is within a level that would be expected to occur based on chance alone. In contrast, a statistically significant difference or increase is one that is greater than what would be expected to occur by chance alone. Statistical significance or lack thereof can be determined by any of various methods well-known in the art. An example of a commonly used measure of statistical significance is the p-value. The p-value represents the probability of obtaining a result at least as extreme as a particular datapoint if that datapoint were the result of random chance alone. A result is often considered significant (not random chance) at a p-value less than or equal to 0.05.
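
For illustration, a p-value can be obtained with an exact permutation test on the difference between two group means, as sketched below. This is only one of the many well-known methods and is not identified in the disclosure as the method used; the function name and the sample values are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the absolute difference of means."""
    pooled = list(group_a) + list(group_b)
    size_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    at_least_as_extreme = 0
    total = 0
    # Enumerate every way of relabeling the pooled values into two groups.
    for indices in combinations(range(len(pooled)), size_a):
        chosen = set(indices)
        relabeled_a = [pooled[i] for i in chosen]
        relabeled_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        difference = abs(mean(relabeled_a) - mean(relabeled_b))
        if difference >= observed - 1e-12:  # small tolerance for rounding
            at_least_as_extreme += 1
        total += 1
    return at_least_as_extreme / total


# Hypothetical marker values for four patients and four control subjects.
patients = [5.1, 5.4, 5.8, 6.0]
controls = [3.9, 4.1, 4.2, 4.4]
p_value = permutation_p_value(patients, controls)
print(p_value <= 0.05)
```

Exhaustive enumeration is practical only for small groups; for larger groups a random subset of relabelings is typically sampled instead.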

In one embodiment, values obtained about the markers and/or other dataset components can be subjected to an analytic process with chosen parameters. The parameters of the analytic process may be those disclosed herein or those derived using the guidelines described herein. The analytic process used to generate a result may be any type of process capable of providing a result useful for classifying a sample, for example, comparison of the obtained value with a reference level, a linear algorithm, a quadratic algorithm, a decision tree algorithm, or a voting algorithm. The analytic process may set a threshold for determining the probability that a sample belongs to a given class. The probability preferably is at least 60%, at least 70%, at least 80%, at least 90%, at least 95% or higher.
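
A voting algorithm with a class threshold might be sketched as follows, under stated assumptions: each marker casts one vote when its value exceeds its reference level, and the fraction of votes is compared with a threshold (60% here). The vote fraction is used as a crude stand-in for a class probability and is not a calibrated probability. Marker names, values, and the function name are hypothetical.

```python
def vote_classify(marker_values, reference_levels, threshold=0.60):
    """Classify a sample by the fraction of markers exceeding their reference levels."""
    votes = [marker_values[name] > reference_levels[name] for name in reference_levels]
    fraction = sum(votes) / len(votes)
    label = "positive" if fraction >= threshold else "negative"
    return label, fraction


# Hypothetical marker values and reference levels.
values = {"marker_a": 2.0, "marker_b": 0.5, "marker_c": 3.1, "marker_d": 1.9, "marker_e": 1.4}
references = {name: 1.0 for name in values}
label, fraction = vote_classify(values, references)
print(label, fraction)
```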

In embodiments, the relevant reference level for a particular marker is obtained based on the particular marker in control subjects. Control subjects are those that are healthy and do not have sarcoidosis or tuberculosis. As an example, the relevant reference level can be the quantity of the particular marker in the control subjects.

In additional embodiments when more than one marker is assayed, values of the detected markers can be combined into a score. Each value can be weighted evenly within an algorithm generating a score, or the values for particular markers can be weighted more heavily in reaching the score. For example, markers with higher sensitivity and/or specificity scores could be weighted more heavily than markers with lower sensitivity and/or specificity scores. As one example, marker values for diagnosing sarcoidosis may be weighted as follows (from highest weight to lowest weight): CCL21; APBB1; Metap1; SH3YL; CLI_3190; PC4; DNAJC1; TNFRSF21; CD14; FGFBP-2. Markers may also be grouped into classes, and each class given a weighted score. For example, marker values for diagnosing sarcoidosis may be grouped into classes and weighted as follows (from highest weight to lowest weight): Class 1: CCL21 and APBB1; Class 2: Metap1 and SH3YL; Class 3: CLI_3190 and PC4; Class 4: DNAJC1 and TNFRSF21; and Class 5: CD14 and FGFBP-2. As another example, marker values for diagnosing sarcoidosis may be grouped into classes and weighted as follows (from highest weight to lowest weight): Class 1: CCL21; APBB1; Metap1 and SH3YL; Class 2: CLI_3190; PC4; DNAJC1 and TNFRSF21; and Class 3: CD14 and FGFBP-2.
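
As a minimal sketch of a weighted score, the fragment below assigns each sarcoidosis marker a weight according to its position in the ordering given above and computes a normalized score from per-marker flags. The disclosure gives only the ordering; the linearly decreasing weights (10 down to 1), the 0/1 flags, and the function names are assumptions introduced for illustration.

```python
# Marker order taken from the text, highest weight first.
SARCOIDOSIS_MARKERS = [
    "CCL21", "APBB1", "Metap1", "SH3YL", "CLI_3190",
    "PC4", "DNAJC1", "TNFRSF21", "CD14", "FGFBP-2",
]


def rank_weights(markers):
    """Assumed scheme: the first marker gets weight n, the last gets weight 1."""
    count = len(markers)
    return {name: count - position for position, name in enumerate(markers)}


def weighted_score(flags, weights):
    """Normalized score in [0, 1]; flags map marker name to 1 (flagged) or 0."""
    total_weight = sum(weights.values())
    return sum(weights[name] * flags.get(name, 0) for name in weights) / total_weight


weights = rank_weights(SARCOIDOSIS_MARKERS)
# Hypothetical sample in which only the three highest-ranked markers are flagged.
flags = {"CCL21": 1, "APBB1": 1, "Metap1": 1}
score = weighted_score(flags, weights)
print(round(score, 4))
```

Setting every weight to the same value in this sketch corresponds to the evenly weighted alternative.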

In particular embodiments, marker values for distinguishing sarcoidosis from tuberculosis may be weighted as follows (from highest weight to lowest weight): HLA-DR; MF5; BfrA; DAB2; WDFY3; FedA; TCEB2; Rv0189C; LRPPRC; TKT. Markers may also be grouped into classes, and each class given a weighted score. For example, marker values for distinguishing sarcoidosis from tuberculosis may be grouped into classes and weighted as follows (from highest weight to lowest weight): Class 1: HLA-DR and MF5; Class 2: BfrA and DAB2; Class 3: WDFY3 and FedA; Class 4: TCEB2 and Rv0189C; and Class 5: LRPPRC and TKT. As another example, marker values for distinguishing sarcoidosis from tuberculosis may be grouped into classes and weighted as follows (from highest weight to lowest weight): Class 1: HLA-DR; MF5; BfrA; and DAB2; Class 2: WDFY3; FedA; TCEB2; and Rv0189C; and Class 3: LRPPRC and TKT.

Any marker or class of markers can be excluded from a particular value calculation. For example, in particular embodiments, Class 5 is excluded. In particular embodiments, Class 4 is excluded. In particular embodiments, Class 3 is excluded. In particular embodiments, Class 2 is excluded. In particular embodiments, Class 1 is excluded. In further embodiments, groups of classes can be excluded, for example, Classes 5 and 4; 5 and 3; 5 and 2; 4 and 3; 4 and 2; 3 and 2; etc.
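
Class-based weighting with optional exclusion of classes can be sketched as shown below, using the five-class grouping of sarcoidosis markers described herein. The numerical class weights (5 down to 1), the 0/1 flags, and the function name are hypothetical; the disclosure specifies only the relative ordering of the classes.

```python
# Five-class grouping of sarcoidosis markers from the text.
CLASSES = {
    "Class 1": ["CCL21", "APBB1"],
    "Class 2": ["Metap1", "SH3YL"],
    "Class 3": ["CLI_3190", "PC4"],
    "Class 4": ["DNAJC1", "TNFRSF21"],
    "Class 5": ["CD14", "FGFBP-2"],
}
# Assumed numerical weights; only the ordering comes from the text.
CLASS_WEIGHTS = {"Class 1": 5, "Class 2": 4, "Class 3": 3, "Class 4": 2, "Class 5": 1}


def class_score(flags, excluded=()):
    """Normalized class-weighted score that ignores any excluded classes."""
    weighted_sum = 0.0
    weight_total = 0.0
    for class_name, members in CLASSES.items():
        if class_name in excluded:
            continue  # excluded classes contribute to neither sum
        weight = CLASS_WEIGHTS[class_name]
        for marker in members:
            weighted_sum += weight * flags.get(marker, 0)
            weight_total += weight
    return weighted_sum / weight_total if weight_total else 0.0


# Hypothetical sample: two Class 1 markers and one Class 5 marker are flagged.
flags = {"CCL21": 1, "APBB1": 1, "CD14": 1}
full_score = class_score(flags)
reduced_score = class_score(flags, excluded=("Class 5", "Class 4"))
print(round(full_score, 4), round(reduced_score, 4))
```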

Particular embodiments disclosed herein include obtaining a sample from a subject suspected of having sarcoidosis; assaying the sample for up- or down-regulation of one or more markers disclosed herein; determining one or more marker values based on the assaying; comparing the one or more marker values to a reference level; and diagnosing sarcoidosis in the subject according to the up- or down-regulation of a marker, as described elsewhere herein.

Particular embodiments also include distinguishing sarcoidosis from tuberculosis in a subject by obtaining a sample from a subject suspected of having sarcoidosis; assaying the sample for up- or down-regulation of one or more markers disclosed herein; determining one or more marker values based on the assaying; comparing the one or more marker values to a reference level; and diagnosing sarcoidosis or tuberculosis in the subject according to the up- or down-regulation of a marker, as described elsewhere herein.

The sample can be any appropriate biological sample obtained from the subject, such as a blood sample, a serum sample, a saliva sample, a urine sample, a bronchoalveolar lavage sample, etc. The sample also can be obtained from a biopsy of an affected tissue or organ, such as a lung biopsy, or lymph gland biopsy. The sample can include cells of affected tissue or organ.

A diagnosis according to the systems and methods disclosed herein can direct a treatment regimen. For example, a sarcoidosis diagnosis can direct treatment with a sarcoidosis treatment (e.g., lifestyle and behavioral interventions; corticosteroids; methotrexate or azathioprine; hydroxychloroquine or chloroquine; cyclophosphamide or chlorambucil; pentoxifylline and thalidomide; infliximab or adalimumab; colchicine; various nonsteroidal anti-inflammatory drugs (NSAIDs, e.g., ibuprofen or aspirin); organ transplantation). A tuberculosis diagnosis can direct treatment with a tuberculosis treatment (e.g., isoniazid (INH); rifampin (RIF); ethambutol (EMB); pyrazinamide (PZA)). A healthy diagnosis can direct further medical analysis if the subject's symptoms suggest further analysis is warranted. Administered treatments will be delivered in therapeutically effective amounts leading to an improvement or resolution of the treated condition, as assessed by a practicing physician, veterinarian or researcher.

The systems and methods disclosed herein include kits. Disclosed kits include materials and reagents necessary to assay a sample obtained from a subject for one or more markers disclosed herein. The materials and reagents can include those necessary to assay the markers disclosed herein according to any method described herein and/or known to one of ordinary skill in the art.

Particular embodiments include materials and reagents necessary to assay for up- or down-regulation of a marker protein in a sample. In particular embodiments, the kits include antibodies to marker proteins and/or can also include aptamers, epitopes or mimotopes. Other embodiments additionally or alternatively include oligonucleotides that specifically assay for one or more marker nucleic acids based on homology and/or complementarity with marker nucleic acids. The oligonucleotide sequences may correspond to fragments of the marker nucleic acids. For example, the oligonucleotides can be more than 200, 175, 150, 100, 50, 25, 10, or fewer than 10 nucleotides in length. Collectively, any molecule (e.g., antibody, aptamer, epitope, mimotope, oligonucleotide) that forms a complex with a marker is referred to as a marker binding agent herein.

Embodiments of kits can contain in separate containers marker binding agents either bound to a matrix, or packaged separately with reagents for binding to a matrix. In particular embodiments, the matrix is, for example, a porous strip. In some embodiments, measurement or detection regions of the porous strip can include a plurality of sites containing marker binding agents. In some embodiments, the porous strip can also contain sites for negative and/or positive controls. Alternatively, control sites can be located on a separate strip from the porous strip. Optionally, the different detection sites can contain different amounts of marker binding agents, e.g., a higher amount in the first detection site and lesser amounts in subsequent sites. Upon the addition of test sample, the number of sites displaying a detectable signal provides a quantitative indication of the amount of marker present in the sample. The detection sites can be configured in any suitably detectable shape and can be, e.g., in the shape of a bar or dot spanning the width (or a portion thereof) of a porous strip.

In some embodiments the matrix can be a solid substrate, such as a “chip.” See, e.g., U.S. Pat. No. 5,744,305. In some embodiments the matrix can be a solution array; e.g., xMAP (Luminex, Austin, Tex.), Cyvera (Illumina, San Diego, Calif.), RayBio Antibody Arrays (RayBiotech, Inc., Norcross, Ga.), CellCard (Vitra Bioscience, Mountain View, Calif.) and Quantum Dots' Mosaic (Invitrogen, Carlsbad, Calif.).

Additional embodiments can include control formulations (positive and/or negative), and/or one or more detectable labels, such as fluorescein, green fluorescent protein, rhodamine, cyanine dyes, Alexa dyes, luciferase, and radiolabels, among others. Instructions for carrying out the assay, including, optionally, instructions for generating a score, can be included in the kit; e.g., written, tape, VCR, or CD-ROM.

In particular embodiments, the kits include materials and reagents necessary to conduct an immunoassay (e.g., ELISA). In particular embodiments, the kits include materials and reagents necessary to conduct hybridization assays (e.g., PCR). In particular embodiments, materials and reagents expressly exclude equipment (e.g., plate readers). In particular embodiments, kits can exclude materials and reagents commonly found in laboratory settings (pipettes; test tubes; distilled H₂O).

Numerous protein and gene sequence markers are disclosed herein. The disclosure is not limited to the particularly disclosed protein and gene sequences but instead also encompasses sequences including 80% sequence identity; 81% sequence identity; 82% sequence identity; 83% sequence identity; 84% sequence identity; 85% sequence identity; 86% sequence identity; 87% sequence identity; 88% sequence identity; 89% sequence identity; 90% sequence identity; 91% sequence identity; 92% sequence identity; 93% sequence identity; 94% sequence identity; 95% sequence identity; 96% sequence identity; 97% sequence identity; 98% sequence identity or 99% sequence identity.

When a protein sequence is provided, its gene sequences can be derived by one of ordinary skill in the art by, for example, consulting publicly available databases. In addition to the sequence identity parameters provided above, gene sequences that hybridize to derived sequences under high stringency conditions can also be included within the scope of the current disclosure. A gene or polynucleotide fragment “hybridizes” to another gene or polynucleotide fragment, such as a cDNA, genomic DNA, or RNA, when a single stranded form of the polynucleotide fragment anneals to the other polynucleotide fragment under the appropriate conditions of temperature and solution ionic strength. Hybridization and washing conditions are well known and exemplified in Sambrook, J., Fritsch, E. F. and Maniatis, T. Molecular Cloning: A Laboratory Manual, Second Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor (1989), particularly Chapter 11 and Table 11.1 therein (incorporated by reference herein for its teachings regarding the same). The conditions of temperature and ionic strength determine the “stringency” of the hybridization. Stringency conditions can be adjusted to screen for fragments ranging from moderately similar fragments (such as homologous sequences from distantly related organisms) to highly similar fragments (such as genes that duplicate functional enzymes from closely related organisms). Post-hybridization washes determine stringency conditions. One set of hybridization conditions to demonstrate that sequences hybridize uses a series of washes starting with 6×SSC, 0.5% SDS at room temperature for 15 min, then repeated with 2×SSC, 0.5% SDS at 45° C. for 30 min, and then repeated twice with 0.2×SSC, 0.5% SDS at 50° C. for 30 min. Stringent conditions use higher temperatures in which the washes are identical to those above except that the temperature of the final two 30 min washes in 0.2×SSC, 0.5% SDS is increased to 60° C. Highly stringent conditions use two final washes in 0.1×SSC, 0.1% SDS at 65° C. Those of ordinary skill in the art will recognize that these temperature and wash solution salt concentrations may need to be adjusted as necessary according to factors such as the length of the hybridizing sequences.

Also disclosed herein is a cDNA library including mRNA isolated from (i) bronchoalveolar cells (BAL) of sarcoidosis patients; and (ii) white blood cells obtained from sarcoidosis patients. In further embodiments, the cDNA library further includes mRNA isolated from (iii) human splenic monocytes; and/or (iv) embryonic lung fibroblasts. The cDNA library can be screened for markers associated with sarcoidosis or related disorders. The cDNA library can be a phage display library, a ribosome display library, or a nucleic acid display library. In particular embodiments, the cDNA library is a T7 phage display library. In particular embodiments, the cDNA library should be biopanned to negatively select and/or enrich for detection markers of interest. For example, biopanning with samples from control subjects can remove potential hits that are non-specific to the condition of interest, resulting in negative selection. Biopanning with samples from subjects of interest (e.g., subjects having a condition of interest) selects potential hits that are specific to the condition of interest, resulting in enrichment of the cDNA library for hits of potential interest. The systems and methods disclosed herein include biopanning a cDNA library including mRNA isolated from (i) bronchoalveolar cells (BAL) of sarcoidosis patients; (ii) white blood cells obtained from sarcoidosis patients; (iii) human splenic monocytes; and (iv) embryonic lung fibroblasts to negatively select for and/or enrich the library for hits of interest.

In embodiments, the cDNA library is differentially biopanned to identify markers for sarcoidosis. As described above, differential biopanning involves biopanning by negative selection using sera from control subjects to remove non-specific IgG, followed by biopanning by positive enrichment using sera from sarcoidosis patients.
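
The logic of differential biopanning (negative selection followed by positive enrichment) can be pictured with the toy model below. It is a conceptual sketch only: biopanning is a laboratory procedure, and the clone identifiers, the reactivity sets, and the function name are hypothetical stand-ins rather than a simulation of the actual protocol.

```python
def differential_biopan(library, control_reactive, patient_reactive):
    """Toy model: drop clones bound by control sera, then keep clones bound by patient sera."""
    # Negative selection: remove clones recognized by sera from control subjects.
    after_negative = [clone for clone in library if clone not in control_reactive]
    # Positive enrichment: retain clones recognized by sera from patients.
    return [clone for clone in after_negative if clone in patient_reactive]


# Hypothetical clones and reactivity sets.
library = ["clone_1", "clone_2", "clone_3", "clone_4", "clone_5"]
control_reactive = {"clone_2", "clone_4"}
patient_reactive = {"clone_1", "clone_2", "clone_5"}
enriched = differential_biopan(library, control_reactive, patient_reactive)
print(enriched)
```

In this toy model, a clone recognized by both control and patient sera is removed at the negative selection step, so it does not appear among the enriched clones.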

Additional embodiments include adhering cDNA expression products from a negatively selected and enriched cDNA library to a microarray. Additional embodiments include exposing the microarray to samples from subjects of interest and control samples. Additional embodiments include detecting cDNA expression products bound by molecules in samples from the subjects of interest. Additional embodiments include performing data analysis to identify molecules that bind cDNA expression products as markers of a condition of interest.

One embodiment includes detecting sarcoidosis or tuberculosis antigens by: (a) preparing a phage display library of sarcoidosis or tuberculosis antigens from cells of one or more subjects with sarcoidosis; (b) enriching the phage display library for sarcoidosis or tuberculosis antigens by biopanning; (c) selecting clones for amplification; (d) testing amplified clones for binding to antibodies in sera of sarcoidosis subjects; and (e) sequencing bound clones.

Another embodiment includes a library and method to identify sarcoidosis markers. One embodiment includes identifying proteins that bind to expression products of phage display clones derived from a library including mRNA isolated from (i) bronchoalveolar cells (BAL) of sarcoidosis patients; (ii) white blood cells obtained from sarcoidosis patients; (iii) human splenic monocytes; and/or (iv) embryonic lung fibroblasts. Another embodiment includes identifying proteins that bind to expression products of phage display clones derived from a library including mRNA isolated from (i) bronchoalveolar cells (BAL) of sarcoidosis patients; (ii) white blood cells obtained from sarcoidosis patients; (iii) human splenic monocytes; and (iv) embryonic lung fibroblasts. Following binding, identified proteins can be characterized and, in particular embodiments, synthesized.

These embodiments can be used to identify additional markers to diagnose systemic sarcoidosis, pulmonary sarcoidosis, cutaneous sarcoidosis, Lofgren's syndrome, neurosarcoidosis, cardiac sarcoidosis, ocular sarcoidosis, hepatic sarcoidosis, musculoskeletal sarcoidosis, renal sarcoidosis, or sarcoidosis with the involvement of other organs or tissues.

In embodiments, diagnosis of sarcoidosis may be achieved in accordance with the previously disclosed methods through the use of a computing device to provide for a quicker, more reliable, and less labor intensive diagnosis.

FIG. 10 shows an illustrative schematic 1000 for diagnosing sarcoidosis in a subject 1002 on a computing device 1008, including an illustrative diagram 1028 of a computing device 1008 implementing the diagnostic framework 1018. Sample biological material 1004 is collected from the subject 1002. That sample 1004 may be assayed for the presence of one or more markers. An indication of the up- or down-regulation of the markers is reflected by one or more marker values 1006 generated after assaying and analyzing the sample 1004. A computing device 1008 implementing the diagnostic framework 1018 will analyze and diagnose the subject 1002 as healthy, having sarcoidosis, or in some embodiments, having tuberculosis. The diagnosis is published to a user via a graphical user interface 1026.

In embodiments, to enhance security, subject privacy, and compliance with government regulations, subject data such as the subject's marker values 1006 may be deleted after it is used to generate a computer assisted diagnosis. Thus, the sample information will no longer exist as standalone information on the one or more computing devices 1008 implementing the diagnostic framework 1018. As a result, the only subject data available to the computing device 1008 will be integrated into the diagnosis provided by the one or more computing devices.
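
One way to picture the deletion of subject data after use is the sketch below, in which the raw marker values are detached from a record, passed to a diagnostic routine, and then cleared, so that only the diagnosis remains. The record layout, the function names, and the toy diagnostic rule are all hypothetical. Clearing an in-memory container in Python does not by itself securely erase data from memory or storage, so an actual deployment would need additional safeguards.

```python
def diagnose_and_discard(record, diagnose):
    """Return a report holding only the diagnosis; discard the raw marker values."""
    marker_values = record.pop("marker_values")  # detach raw values from the record
    try:
        diagnosis = diagnose(marker_values)
    finally:
        marker_values.clear()  # empty the container even if the diagnosis step fails
    return {"subject_id": record["subject_id"], "diagnosis": diagnosis}


def toy_rule(values):
    # Hypothetical rule, for illustration only.
    return "sarcoidosis" if values["marker_a"] > 1.0 else "healthy"


record = {"subject_id": "S-001", "marker_values": {"marker_a": 2.3}}
report = diagnose_and_discard(record, toy_rule)
print(report, "marker_values" in record)
```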

FIG. 10 includes an illustrative diagram 1028 of the computing device 1008. The computing device 1008 may contain one or more processing unit(s) 1012 and memory 1014, both of which may be distributed across one or more physical or logical locations. The processing unit(s) 1012 may include any combination of central processing units (CPUs), graphical processing units (GPUs), single core processors, multi-core processors, application-specific integrated circuits (ASICs), programmable circuits such as Field Programmable Gate Arrays (FPGA), and the like. One or more of the processing unit(s) 1012 may be implemented in software and/or firmware in addition to hardware implementations. Software or firmware implementations of the processing unit(s) 1012 may include computer- or machine-executable instructions written in any suitable programming language to perform the various functions described. Software implementations of the processing unit(s) 1012 may be stored in whole or part in the memory 1014.

Additionally, the functionality of the computing devices 1008 can beperformed, at least in part, by one or more hardware logic components.For example, and without limitation, illustrative types of hardwarelogic components that can be used include Field-programmable Gate Arrays(FPGAs), Application-specific Integrated Circuits (ASICs),Application-specific Standard Products (ASSPs), System-on-a-chip systems(SOCs), Complex Programmable Logic Devices (CPLDs), etc.

Computing device 1008 may be connected to a network through one or more network connectors 1016 for receiving and sending information. The network may be implemented as any type of communications network such as a local area network, a wide area network, a mesh network, an ad hoc network, a peer-to-peer network, the Internet, a cable network, a telephone network, and the like. In embodiments, the computing device 1008 may have a direct connection to one or more other devices (e.g., devices that output subject 1002 information, like marker values 1006, in electrical or electronic form) without the presence of an intervening network. The direct connection may be implemented as a wired connection or a wireless connection. A wired connection may include one or more wires or cables physically connecting the computing device 1008 to another device. For example, the wired connection may be created by a headphone cable, a telephone cable, a SCSI cable, a USB cable, an Ethernet cable, or the like. The wireless connection may be created by radio frequency (e.g., any version of Bluetooth, ANT, Wi-Fi IEEE 802.11, etc.), infrared light, or the like.

The computing device 1008 may be a supercomputer, a network server, a desktop computer, a notebook computer, a collection of server computers such as a server farm, a cloud computing system that uses processing power, memory, and other hardware resources distributed across multiple geographic locations, or the like. The computing device 1008 may include one or more input/output component(s) such as a keyboard, a pointing device, a touchscreen, a microphone, a camera, a display, a speaker, a printer, and the like.

Memory 1014 of the computing device 1008 may include removable storage, non-removable storage, local storage, and/or remote storage to provide storage of computer-readable instructions, data structures, program modules, and other data. The memory 1014 may be implemented as computer-readable media. Computer-readable media includes non-volatile computer-readable storage media, removable and non-removable media implemented in any method or technology for storage of information such as computer-readable instructions, data structures, program modules, or other data. Computer-readable storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information for access by a computing device.

The computing device 1008 includes multiple modules that may be implemented as instructions stored in the memory 1014 for execution by processing unit(s) 1012 and/or implemented, in whole or in part, by one or more hardware logic components or firmware. The diagnostic framework 1018 is contained within the computing device 1008 and may be implemented as instructions stored in the memory 1014 for execution by the processing unit(s) 1012, by hardware logic components, or both.

A scoring module 1020 obtains from an external source an indication of the expression of the tested markers in a sample 1004 as one or more marker value(s) 1006. The marker values 1006 can be obtained from a microarray or any machine connected to the computing device 1008 either directly or through the network connectors 1016. The marker values 1006 may also be previously saved or stored on a separate computing device or computer-readable media prior to being transferred to the scoring module 1020. The marker values 1006 may also be inputted directly by a user, including a physician or laboratory technician, through any appropriate I/O method. Exemplary I/O methods include any methods making use of the previously mentioned input/output components such as a keyboard, camera, microphone, touchscreen, or scanner.

The scoring module 1020 also obtains a reference level corresponding to the one or more marker values 1006. As with the marker values 1006, the reference levels can be calculated, as previously explained, and stored in a reference level database 1024 on the computing device 1008. Those having skill in the art will appreciate, however, that the one or more reference levels 1024 may, in other embodiments, be obtained either directly or through the network connectors 1016 from one or more separate computing devices, machines, or computer-readable media. The reference levels may also be directly inputted by the user.

The scoring module 1020 may partially process, normalize, rewrite, anonymize, or otherwise modify the marker values 1006 or reference levels 1024. The scoring module 1020 will generate a score based at least in part on the one or more marker values 1006. In some embodiments this score is equivalent to the one or more marker values. In other embodiments, the score will be generated based at least in part on the marker values 1006 and a weight associated with each corresponding marker. For example, markers with higher sensitivity, specificity, or both could be weighted more heavily than markers with lower sensitivity or specificity. Alternative scores may be generated based on any other previously discussed analytic process.
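As a non-limiting illustration of the weighted score described above, the following sketch computes a weighted sum of marker values. The weighted-sum form, the function name `weighted_score`, the default weight of 1.0, and the numeric values are assumptions made for illustration; the disclosure does not prescribe a particular formula.

```python
from typing import Mapping


def weighted_score(marker_values: Mapping[str, float],
                   weights: Mapping[str, float]) -> float:
    """Weighted sum of marker values (one assumed scoring scheme).

    Markers without an explicit weight default to a weight of 1.0.
    """
    if not marker_values:
        raise ValueError("at least one marker value is required")
    return sum(value * weights.get(marker, 1.0)
               for marker, value in marker_values.items())


# Hypothetical marker values and weights, for illustration only
values = {"CCL21": 2.0, "Metap1": 1.0, "SH3YL1": 0.5}
weights = {"CCL21": 1.5, "Metap1": 1.0, "SH3YL1": 2.0}
score = weighted_score(values, weights)  # 2.0*1.5 + 1.0*1.0 + 0.5*2.0
```

With all weights equal to 1.0 and a single marker, the score reduces to the marker value itself, which corresponds to the embodiments in which the score is equivalent to the marker value.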

The scoring module 1020 provides the generated score to a diagnostic module 1022. The diagnostic module 1022 compares the score to the reference level and diagnoses the subject 1002 based on a result of the comparison as having sarcoidosis, not having sarcoidosis, or in some embodiments, having tuberculosis. The diagnosis is published to the user via a graphical user interface 1026.
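The comparison performed by the diagnostic module can be sketched as a simple threshold test. The function name `diagnose`, the `higher_is_positive` flag, and the rule that a score beyond the reference level indicates sarcoidosis are assumptions for illustration; the direction of the comparison would depend on whether the scored markers are up- or down-regulated.

```python
def diagnose(score: float, reference_level: float,
             higher_is_positive: bool = True) -> str:
    """Compare a score with a reference level and return a diagnosis label.

    Assumption: a score above the reference level indicates sarcoidosis
    for up-regulated markers; set higher_is_positive=False for a score
    built from down-regulated markers.
    """
    if higher_is_positive:
        positive = score > reference_level
    else:
        positive = score < reference_level
    return "sarcoidosis" if positive else "no sarcoidosis"
```

In this sketch the returned label is what would be published to the user interface.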

Illustrative Process: For ease of understanding, the processes discussed in this disclosure are delineated as separate operations represented as independent blocks. However, these separately delineated operations should not be construed as necessarily order dependent in their performance. The order in which the process is described is not intended to be construed as a limitation, and any number of the described process blocks may be combined in any order to implement the process, or an alternate process. Moreover, it is also possible that one or more of the provided operations is modified or omitted.

FIG. 11 shows an illustrative process 1100 for diagnosing sarcoidosis.

At 1102, one or more reference levels are received, as well as an indication of the expression of relevant markers in a sample. The indication, in the form of one or more marker values, may be received from a clinician who assayed the sample for the values, or it may be received from a database where the values from a previously performed assay have been stored.

At 1104, a score is generated at least partly based on the marker value. The score may be the same as the marker value, or it may be additionally based on a weight corresponding to each tested marker, or based in part on any other previously disclosed analytic process. Note that there may be a score for each marker, or there may be a single score based on an aggregation of data related to multiple marker values.

At 1106, the score is compared to one or more reference levels.

At 1108, a subject is diagnosed based on a result of the comparison 1106 as being healthy, having sarcoidosis, or in some embodiments, having tuberculosis.
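Blocks 1102 through 1108 can be sketched end to end as follows. This is only one possible arrangement: the use of two separate marker panels, the weighted-sum scoring, the `ReferenceLevels` container, the function names, and the choice to evaluate the tuberculosis panel before the sarcoidosis panel are all assumptions introduced for illustration and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ReferenceLevels:
    sarcoidosis: float   # hypothetical cutoff for the sarcoidosis panel score
    tuberculosis: float  # hypothetical cutoff for the tuberculosis panel score


def panel_score(values: Mapping[str, float],
                weights: Mapping[str, float]) -> float:
    """Weighted sum; markers without a weight default to 1.0 (assumption)."""
    return sum(v * weights.get(m, 1.0) for m, v in values.items())


def run_process_1100(sarc_values: Mapping[str, float],
                     tb_values: Mapping[str, float],
                     refs: ReferenceLevels,
                     sarc_weights: Optional[Mapping[str, float]] = None,
                     tb_weights: Optional[Mapping[str, float]] = None) -> str:
    # 1102: reference levels and marker values arrive as arguments.
    # 1104: generate a score for each panel.
    sarc_score = panel_score(sarc_values, sarc_weights or {})
    tb_score = panel_score(tb_values, tb_weights or {})
    # 1106: compare each score with its reference level.
    tb_positive = tb_score > refs.tuberculosis
    sarc_positive = sarc_score > refs.sarcoidosis
    # 1108: diagnose. Checking tuberculosis first is an assumed ordering.
    if tb_positive:
        return "tuberculosis"
    if sarc_positive:
        return "sarcoidosis"
    return "healthy"


# Hypothetical values, for illustration only
refs = ReferenceLevels(sarcoidosis=4.0, tuberculosis=3.0)
outcome = run_process_1100(
    {"CCL21": 2.0, "Metap1": 1.0, "SH3YL1": 0.5},
    {"HLA-DR": 0.5, "MFS": 0.5, "DAB2": 0.5},
    refs,
    sarc_weights={"CCL21": 1.5, "SH3YL1": 2.0},
)
```

Because the operations are not necessarily order dependent, the scoring and comparison steps could equally be performed per marker rather than per panel.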

In embodiments, the subjects diagnosed with sarcoidosis or tuberculosis using the methods disclosed herein can be effectively treated with the appropriate therapy. As an example, treating such subjects includes delivering therapeutically effective amounts of an appropriate drug to alleviate one or more symptoms of sarcoidosis or tuberculosis.

Particular embodiments include:

Embodiment 1. A method of diagnosing sarcoidosis in a subject including assaying a sample derived from a subject for the presence of one or more markers selected from Small inducible cytokine A21 precursor (CCL21); Methionine aminopeptidase 1 (Metap1); Activated RNA polymerase II transcription cofactor variant 4 (PC4); RNA methyltransferase (CLI_3190); Tumor necrosis factor receptor superfamily member 21 precursor (TNFRSF21); Monocyte differentiation antigen CD14 (CD14); DnaJ (Hsp40) homolog subfamily C member 1 precursor (DNAJC1); Amyloid β A4 precursor protein-binding family B member 1-interacting protein (APBB1); Fibroblast growth factor binding protein 2 precursor (FGFBP-2); or SH3 domain-containing YSC84 like protein 1 (SH3YL1); and diagnosing the subject as healthy or having sarcoidosis based on the up- or down-regulation of the one or more markers, as compared to a reference level for each marker.

2. A method of embodiment 1 including assaying the sample for the presence of CCL21; Metap1; CLI_3190; APBB1; and SH3YL1; and diagnosing the subject as healthy or having sarcoidosis based on the up- or down-regulation of the one or more markers.

3. A method of embodiment 1 including assaying the sample for the presence of CCL21; Metap1; PC4; CLI_3190; TNFRSF21; and APBB1; and diagnosing the subject as healthy or having sarcoidosis based on the up- or down-regulation of the one or more markers.

4. A method of embodiment 1 including assaying the sample for the presence of CCL21; PC4; CLI_3190; DNAJC1; APBB1; FGFBP-2; and SH3YL1; and diagnosing the subject as healthy or having sarcoidosis based on the up- or down-regulation of the one or more markers.

5. A method of embodiment 1 including assaying the sample for the presence of CCL21; Metap1; PC4; CLI_3190; TNFRSF21; CD14; DNAJC1; APBB1; FGFBP-2; and SH3YL1; and diagnosing the subject as healthy or having sarcoidosis based on the up- or down-regulation of the one or more markers.

6. A method of distinguishing sarcoidosis from tuberculosis in a subject including assaying a sample obtained from the subject for the presence of one or more markers selected from Ferredoxin (Fed A); WDFY3 protein (WDFY3); Membrane protein (MFS); Leucine rich PPR-motif containing protein (LRPPRC); HLA-DR alpha (HLA-DR); Transketolase (TKT); Dihydroxy acid dehydratase (Rv0189C); Chain A Mycobacterium tuberculosis (BfrA); Disabled homolog 2 isoform 2 (DAB2); or Transcription elongation factor B polypeptide 2 isoform (Homo sapiens) (TCEB2); and diagnosing the subject as healthy, having sarcoidosis or having tuberculosis based on the up- or down-regulation of the one or more markers as compared to a reference level for each marker.

7. A method of embodiment 6 including assaying the sample for the presence of Fed A; WDFY3; MFS; LRPPRC; HLA-DR; TKT; Rv0189C; BfrA; DAB2; and TCEB2; and diagnosing the subject as healthy, having sarcoidosis or having tuberculosis based on the up- or down-regulation of the one or more markers.

8. A method of embodiment 6 including assaying the sample for the presence of HLA-DR; MFS; DAB2; BfrA; and WDFY3; and diagnosing the subject as healthy, having sarcoidosis or having tuberculosis based on the up- or down-regulation of the one or more markers.

9. A method of embodiment 6 including assaying the sample for the presence of HLA-DR; MFS; and DAB2; and diagnosing the subject as healthy, having sarcoidosis or having tuberculosis based on the up- or down-regulation of the one or more markers.

10. A kit for diagnosing sarcoidosis in a subject wherein the kit includes a protein that binds CCL21; Metap1; PC4; CLI_3190; TNFRSF21; CD14; DNAJC1; APBB1; FGFBP-2; or SH3YL1; and a detectable label.

11. A kit according to embodiment 10 including one or more proteins that bind CCL21; Metap1; CLI_3190; APBB1; or SH3YL1; and a detectable label.

12. A kit according to embodiment 10 including one or more proteins that bind CCL21; Metap1; PC4; CLI_3190; TNFRSF21; or APBB1; and a detectable label.

13. A kit according to embodiment 10 including one or more proteins that bind CCL21; PC4; CLI_3190; DNAJC1; APBB1; FGFBP-2; or SH3YL1; and a detectable label.

14. A kit according to embodiment 10 including one or more proteins that bind CCL21; Metap1; PC4; CLI_3190; TNFRSF21; CD14; DNAJC1; APBB1; FGFBP-2; or SH3YL1; and a detectable label.

15. A kit for distinguishing sarcoidosis from tuberculosis in a subject wherein the kit includes a protein that binds Fed A; WDFY3; MFS; LRPPRC; HLA-DR; TKT; Rv0189C; BfrA; DAB2; or TCEB2; and a detectable label.

16. A kit according to embodiment 15 including one or more proteins that bind Fed A; WDFY3; MFS; LRPPRC; HLA-DR; TKT; Rv0189C; BfrA; DAB2; or TCEB2; and a detectable label.

17. A kit according to embodiment 15 including one or more proteins that bind HLA-DR; MFS; DAB2; BfrA; or WDFY3; and a detectable label.

18. A kit according to embodiment 15 including one or more proteins that bind HLA-DR; MFS; or DAB2; and a detectable label.

19. A kit according to any one of embodiments 10-18 wherein the proteins include antibodies, epitopes or mimotopes.

20. A kit for diagnosing sarcoidosis in a subject wherein the kit includes a nucleic acid that binds a gene encoding CCL21; Metap1; PC4; CLI_3190; TNFRSF21; CD14; DNAJC1; APBB1; FGFBP-2; or SH3YL1; and a detectable label.

21. A kit according to embodiment 20 including one or more nucleic acids that bind a gene encoding CCL21; Metap1; CLI_3190; APBB1; or SH3YL1; and a detectable label.

22. A kit according to embodiment 20 including one or more nucleic acids that bind a gene encoding CCL21; Metap1; PC4; CLI_3190; TNFRSF21; or APBB1; and a detectable label.

23. A kit according to embodiment 20 including one or more nucleic acids that bind a gene encoding CCL21; PC4; CLI_3190; DNAJC1; APBB1; FGFBP-2; or SH3YL1; and a detectable label.

24. A kit according to embodiment 20 including one or more nucleic acids that bind a gene encoding CCL21; Metap1; PC4; CLI_3190; TNFRSF21; CD14; DNAJC1; APBB1; FGFBP-2; or SH3YL1; and a detectable label.

25. A kit for distinguishing sarcoidosis from tuberculosis in a subject wherein the kit includes one or more nucleic acids that bind a gene encoding Fed A; WDFY3; MFS; LRPPRC; HLA-DR; TKT; Rv0189C; BfrA; DAB2; or TCEB2; and a detectable label.

26. A kit according to embodiment 25 including one or more nucleic acids that bind a gene encoding Fed A; WDFY3; MFS; LRPPRC; HLA-DR; TKT; Rv0189C; BfrA; DAB2; or TCEB2; and a detectable label.

27. A kit according to embodiment 25 including one or more nucleic acids that bind a gene encoding HLA-DR; MFS; DAB2; BfrA; or WDFY3; and a detectable label.

28. A kit according to embodiment 25 including one or more nucleic acids that bind a gene encoding HLA-DR; MFS; or DAB2; and a detectable label.

29. A kit according to any one of embodiments 10-28 wherein the detectable label is a radioactive isotope, enzyme, dye, fluorescent dye, magnetic bead, or biotin.

30. A kit according to any one of embodiments 10-29 wherein the kit further includes reagents to perform an enzyme-linked immunosorbent assay (ELISA), a radioimmunoassay (RIA), a Western blot, an immunoprecipitation, an immunohistochemical staining, flow cytometry, fluorescence-activated cell sorting (FACS), an enzyme substrate color method, and/or an antigen-antibody agglutination.

31. A method of diagnosing sarcoidosis in a subject including obtaining a sample from a subject; assaying the sample for one or more markers selected from CCL21; Metap1; PC4; CLI_3190; TNFRSF21; CD14; DNAJC1; APBB1; FGFBP-2; or SH3YL1; obtaining a value based on the assay; comparing the value to a reference level; and diagnosing the subject as healthy or having sarcoidosis based on the up- or down-regulation of the one or more markers as demonstrated by the value and the reference level.

32. A method according to embodiment 31 including assaying the sample for one or more markers selected from CCL21; Metap1; CLI_3190; APBB1; or SH3YL1.

33. A method according to any one of embodiments 31 or 32 including assaying the sample for one or more markers selected from CCL21; Metap1; PC4; CLI_3190; TNFRSF21; or APBB1.

34. A method according to any one of embodiments 31-33 including assaying the sample for one or more markers selected from CCL21; PC4; CLI_3190; DNAJC1; APBB1; FGFBP-2; or SH3YL1.

35. A method according to any one of embodiments 31-34 including assaying the sample for one or more markers selected from CCL21; Metap1; PC4; CLI_3190; TNFRSF21; CD14; DNAJC1; APBB1; FGFBP-2; or SH3YL1.

36. A method of distinguishing sarcoidosis from tuberculosis in a subject including: obtaining a sample derived from the subject; assaying the sample for one or more markers selected from Fed A; WDFY3; MFS; LRPPRC; HLA-DR; TKT; Rv0189C; BfrA; DAB2; or TCEB2; obtaining a value based on the assay; comparing the value to a reference level; and diagnosing the subject as healthy, having sarcoidosis or having tuberculosis based on the up- or down-regulation of the one or more markers as demonstrated by the value and the reference level.

37. A method according to embodiment 36 including assaying the sample for one or more markers selected from Fed A; WDFY3; MFS; LRPPRC; HLA-DR; TKT; Rv0189C; BfrA; DAB2; or TCEB2.

38. A method according to embodiment 36 or 37 including assaying the sample for one or more markers selected from HLA-DR; MFS; DAB2; BfrA; or WDFY3.

39. A method according to any one of embodiments 36-38 including assaying the sample for one or more markers selected from HLA-DR alpha (HLA-DR); Membrane protein (MFS); or Disabled homolog 2 isoform 2 (DAB2).

40. A method according to any one of embodiments 31-39, wherein assaying the sample for one or more markers includes contacting the sample with a probe including a detectable label, wherein the probe binds the marker.

41. A method of any one of embodiments 31-40, wherein obtaining a value based on the assay includes analyzing the binding of the probe to the marker in the sample.

42. A method of any one of embodiments 31-41, wherein analyzing the binding of the probe to the marker in the sample includes quantitating the amount of the marker in the sample.

43. A method of any one of embodiments 31-42, wherein the sample is a tissue sample, a cell sample, a whole blood sample, a serum sample, a plasma sample, a saliva sample, a sputum sample, or a urine sample.

44. A method of any one of embodiments 31-43 wherein the value is a score.

45. A method of any one of embodiments 31-44 wherein the score is a weighted score.

46. A cDNA library including mRNA isolated from (i) bronchoalveolar cells (BAL) of sarcoidosis patients; and (ii) white blood cells obtained from sarcoidosis patients.

47. A cDNA library of embodiment 46 further including mRNA isolated from (iii) human splenic monocytes; and/or (iv) embryonic lung fibroblasts.

48. A cDNA library of any one of embodiments 46 or 47 wherein the cDNA library is a phage display library.

49. A cDNA library of any one of embodiments 46-48 wherein the phage display library is a T7 phage display library.

50. A cDNA library of any one of embodiments 46-49 wherein cDNA from each cell type is linked to an identifying sequence or tag.

51. A cDNA library of any one of embodiments 46-50 wherein the identifying sequence or tag is a modified linker selected from EcoR1/HindIII; ALA; LEU; and THR.

52. A cDNA library of any one of embodiments 46-51 following biopanning.

53. A cDNA library of any one of embodiments 46-52 wherein the biopanning includes negative selection and/or enrichment.

54. A method of identifying markers to diagnose sarcoidosis including adhering cDNA expression products from a cDNA library of any one of embodiments 46-52 to a microarray; exposing the microarray to samples from sarcoidosis subjects and control subjects; detecting cDNA expression products bound by molecules in the samples from sarcoidosis subjects but not by samples from control subjects; and performing data analysis to identify bound molecules that reliably diagnose sarcoidosis.

55. A method of detecting sarcoidosis or tuberculosis antigens by: (a) preparing a phage display library of sarcoidosis or tuberculosis antigens from cells of one or more subjects with sarcoidosis; (b) enriching the phage display library for sarcoidosis or tuberculosis antigens by biopanning; (c) selecting clones for amplification; (d) testing amplified clones for binding to antibodies in sera of sarcoidosis subjects; and (e) sequencing bound clones.

56. A method of embodiment 55 wherein the cells are bronchoalveolar cells (BAL) and white blood cells.

57. A method of any one of embodiments 55 or 56 wherein the sarcoidosis subject has systemic sarcoidosis, pulmonary sarcoidosis, cutaneous sarcoidosis, Lofgren's syndrome, neurosarcoidosis, cardiac sarcoidosis, ocular sarcoidosis, hepatic sarcoidosis, musculoskeletal sarcoidosis, renal sarcoidosis, or sarcoidosis with the involvement of other organs or tissues.

58. A method of any one of embodiments 55-57 wherein the detected sarcoidosis antigens are specific to systemic sarcoidosis, pulmonary sarcoidosis, cutaneous sarcoidosis, Lofgren's syndrome, neurosarcoidosis, cardiac sarcoidosis, ocular sarcoidosis, hepatic sarcoidosis, musculoskeletal sarcoidosis, renal sarcoidosis, or sarcoidosis with the involvement of other organs or tissues.

59. A kit to practice a method of any one of embodiments 31-45 or 54-58.

60. A method of identifying markers for sarcoidosis including preparing a cDNA library including mRNA isolated from (i) bronchoalveolar cells (BAL) of sarcoidosis patients; (ii) white blood cells obtained from sarcoidosis patients; (iii) human splenic monocytes; and (iv) embryonic lung fibroblasts; biopanning the cDNA library to isolate clones expressing antigens for sarcoidosis from the cDNA library; and identifying the antigens as markers for sarcoidosis.

61. The method of embodiment 60, wherein biopanning includes differential biopanning.

62. The method of any one of embodiments 60 or 61, wherein differential biopanning includes using sera from healthy control subjects to remove non-specific IgG.

63. The method of any one of embodiments 60-62, wherein differential biopanning further includes using sarcoidosis sera for positive enrichment.

64. The method of any one of embodiments 60-63, wherein identifying the antigens includes immobilizing the clones on a microarray; contacting the antigens in the clones with sera of sarcoidosis patients; and analyzing binding of the antigens to the sera.

65. The method of any one of embodiments 60-64, wherein analyzing binding of the antigens to the sera includes quantifying the binding of the antigens to the sera.

66. The method of any one of embodiments 60-65, wherein the analyzing binding of the antigens to the sera includes comparing the binding of the antigens to the sera of sarcoidosis patients with the binding of the antigens to the sera of healthy subjects.

67. The method of any one of embodiments 60-66, further including identifying markers for tuberculosis, the method further including obtaining the clones expressing the antigens identified as markers for sarcoidosis, contacting the clones with sera from tuberculosis patients to identify clones expressing antigens for tuberculosis, and identifying the antigens as markers for tuberculosis.

68. A microarray including a protein that binds CCL21; Metap1; PC4; CLI_3190; TNFRSF21; CD14; DNAJC1; APBB1; FGFBP-2; or SH3YL1.

69. A microarray including a protein that binds Fed A; WDFY3; MFS; LRPPRC; HLA-DR; TKT; Rv0189C; BfrA; DAB2; or TCEB2.

70. A microarray including a nucleic acid that binds to a gene encoding CCL21; Metap1; PC4; CLI_3190; TNFRSF21; CD14; DNAJC1; APBB1; FGFBP-2; or SH3YL1.

71. A microarray including a nucleic acid that binds a gene encoding Fed A; WDFY3; MFS; LRPPRC; HLA-DR; TKT; Rv0189C; BfrA; DAB2; or TCEB2.

72. A microarray including one or more of the following proteins: CCL21; Metap1; PC4; CLI_3190; TNFRSF21; CD14; DNAJC1; APBB1; FGFBP-2; or SH3YL1.

73. A microarray including one or more of the following proteins: Fed A; WDFY3; MFS; LRPPRC; HLA-DR; TKT; Rv0189C; BfrA; DAB2; or TCEB2.

74. The microarray of any one of embodiments 68-73, wherein the protein or the nucleic acid on the microarray includes a label that can be detected.

75. The microarray of any one of embodiments 68, 69, or 72-74, wherein the microarray includes two or more, three or more, four or more, five or more, six or more, seven or more, eight or more, or nine or more of the proteins on the microarray.

76. The microarray of any one of embodiments 70, 71, or 74, wherein the microarray includes two or more, three or more, four or more, five or more, six or more, seven or more, eight or more, or nine or more of the nucleic acids on the microarray.

77. A kit comprising the microarray of any one of embodiments 68-76.

78. A method of treating a subject having symptoms of sarcoidosis including: diagnosing whether the subject has sarcoidosis, the diagnosing including obtaining a sample from the subject, contacting the microarray of any one of embodiments 68, 70, or 72 with the sample from the subject, and detecting up- and/or down-regulation of a protein or nucleic acid on the microarray as compared to a reference level, thereby diagnosing the subject as having sarcoidosis; and treating the subject with a drug that alleviates the symptoms of sarcoidosis.

79. The method of embodiment 78, wherein the drug is a nonsteroidal anti-inflammatory.

80. The method of embodiment 79, wherein the drug is corticosteroids, methotrexate, azathioprine, hydroxychloroquine, chloroquine, cyclophosphamide, chlorambucil, pentoxifylline, thalidomide, infliximab, adalimumab, or colchicine.

81. A method of treating a subject having symptoms of tuberculosis and sarcoidosis including: diagnosing that the subject has tuberculosis, the diagnosing including obtaining a biological sample from the subject, contacting the microarray of embodiment 69, 71, or 73 with the biological sample from the subject, and detecting up- and/or down-regulation of a protein or nucleic acid on the microarray, thereby diagnosing the subject as having tuberculosis; and treating the subject with a drug that alleviates the symptoms of tuberculosis.

82. The method of embodiment 81, wherein the drug is isoniazid (INH), rifampin (RIF), ethambutol (EMB), or pyrazinamide (PZA).

The Examples below are included to demonstrate particular embodiments of the disclosure. Those of ordinary skill in the art should recognize in light of the present disclosure that many changes can be made to the specific embodiments disclosed herein and still obtain a like or similar result without departing from the spirit and scope of the disclosure.

83. A computer-implemented method of diagnosing subjects having sarcoidosis, the computer-implemented method including: receiving at a computer system a value representing an expression of one or more of the following markers in a subject sample: CCL21; Metap1; PC4; CLI_3190; TNFRSF21; CD14; DNAJC1; APBB1; FGFBP-2; or SH3YL1; generating a score based at least in part on the one or more values and a weight associated with each of the one or more corresponding markers; comparing the score to a reference level; and diagnosing the subject as having sarcoidosis or not having sarcoidosis based on a result of the comparison.

84. A computing device for diagnosing sarcoidosis including: a processing unit; a memory; a user interface; a scoring module configured to: receive a value representing an expression of each of one or more of the following markers: CCL21; Metap1; PC4; CLI_3190; TNFRSF21; CD14; DNAJC1; APBB1; FGFBP-2; or SH3YL1; and generate a score based at least in part on the one or more values and a weight associated with each of the corresponding markers; and a diagnostic module configured to: compare the score to a reference level; diagnose the subject as having sarcoidosis or not having sarcoidosis based on a result of the comparison; and publish the diagnosis to the user interface.

85. A computer-implemented method of distinguishing sarcoidosis from tuberculosis in a subject, the computer-implemented method including: receiving at a computer system a value representing an expression of each of one or more of the following markers in a subject sample: Fed A; WDFY3; MFS; LRPPRC; HLA-DR; TKT; Rv0189C; BfrA; DAB2; or TCEB2; generating a score based at least in part on the value and a weight associated with each of the one or more markers; comparing the score to a reference level; diagnosing the subject as healthy, having sarcoidosis, or having tuberculosis based on a result of the comparison; and publishing a result.

86. A computing device for diagnosing sarcoidosis including: a processing unit; a memory; a user interface; a scoring module configured to: receive a reference level, and a value representing an expression of each of one or more of the following markers in a subject sample: Fed A; WDFY3; MFS; LRPPRC; HLA-DR; TKT; Rv0189C; BfrA; DAB2; or TCEB2; and generate a score based at least in part on the value and a weight associated with each of the one or more markers; and a diagnostic module configured to: compare the score to a reference level; diagnose the subject as having sarcoidosis or not having sarcoidosis based on a result of the comparison; and publish the diagnosis to the user interface.

87. The computer-implemented method of embodiments 83 or 85, wherein the method includes receiving a reference level.

88. The computing device of embodiments 84 or 86, wherein the scoring module receives a reference level.

EXAMPLES

Significance. Aberrant immune responses are a major cause of a vast array of human diseases. Sarcoidosis is an inflammatory disease of unknown etiology sharing similarities with non-infectious and infectious granulomatous diseases, including Mycobacterium tuberculosis infection. Tuberculosis (TB) remains a major global health problem. There is a tremendous need to develop accurate tests to diagnose sarcoidosis and TB. A highly sensitive and specific T7 phage antigen library derived from bronchoalveolar lavage cells and leukocytes of sarcoidosis subjects was developed. This complex cDNA library was biopanned and a microarray was constructed to immunoscreen sera from healthy, sarcoidosis and TB subjects. A panel of specific antigens to classify sarcoidosis from healthy controls and subjects with TB was identified.

Introduction. Sarcoidosis is an inflammatory granulomatous disease of unknown etiology affecting multiple organs, such as lungs, skin, CNS, and eyes. Common features shared by patients with sarcoidosis are the presence of non-caseating granuloma, a lack of cutaneous reaction to tuberculin skin testing (PPD) and increased local and circulating inflammatory cytokines. In addition, there is evidence of abnormal immune function that presents as cutaneous anergy accompanied by hypergammaglobulinemia. Sarcoidosis shares striking clinical and pathological similarities with infectious granulomatous diseases, especially Mycobacterium tuberculosis (MTB). Iannuzzi et al., N. Engl. J. Med. 2007; 357(21): 2153-65; Prince et al., J. Allergy Clin. Immunol. 2003; 111 (2 Suppl): S613-23. Although there is mounting evidence of the presence of nonviable bacterial components (including MTB and Propionibacterium acnes) in sarcoidosis tissue (Gupta et al., Eur. Respir. J. 2007; 30(3): 508-16; Chen et al., Am. J. Respir. Crit. Care Med.; 181(4): 360-73; Negi et al., Modern pathology: an official journal of the United States and Canadian Academy of Pathology, Inc. 2012; 25(9): 1284-97), all attempts to isolate viable MTB or other microbial pathogens from sarcoidosis tissue have failed. Hunninghake et al., Sarcoidosis Vasc Diffuse Lung Dis 1999; 16(2): 149-73; Chen et al., J. Immunol. 2008; 181(12): 8784-96.

Intradermal injection of the Kveim-Siltzbach suspension (a granulomatous splenic tissue suspension) induces granuloma formation weeks later in sarcoidosis patients, suggesting the presence of antigen(s) in granuloma tissue and host immunoreactivity to these antigens. Proteomics, genomics, transcriptomics, and high throughput technology clearly suggest that early immune reaction to diverse antigens is highly prevalent in a large number of rheumatic, neoplastic, and inflammatory diseases such as sarcoidosis. Several studies using state-of-the-art technologies have attempted to identify sarcoidosis antigens or to identify the underlying genetic and environmental factors (Hajizadeh et al., J. Clin. Immunol. 2007; 27(4): 445-54; Chen et al., Proc. Am. Thorac. Soc. 2007; 4(1): 101-7; Zhang et al., Respiratory research 2013; 14: 18), yet unifying environmental or genetic factors as initiators of this disease have not been found. Hunninghake et al., Sarcoidosis Vasc Diffuse Lung Dis 1999; 16(2): 149-73; Dubaniewicz, Autoimmunity reviews 2010; 9(6): 419-24; Eishi et al., J Clin Microbiol 2002; 40(1): 198-204; Oswald-Richter & Drake, Seminars in respiratory and critical care medicine 2010; 31(4): 375-9. These studies reported a number of markers or variations in gene expression signatures, which, however, failed to discriminate between sarcoidosis and other inflammatory or granulomatous diseases. Koth et al., Am. J. Resp. Crit. Care 2011; 184(10): 1153-63; Maertzdorf et al., Proc. Natl. Acad. Sci. USA 2012; 109(20): 7853-8. This is partly due to the fact that several inflammatory diseases may respond to various antigens with activation of a similar transcriptome and/or inflammatory gene expression profiles.

Because non-caseating granulomas, cutaneous anergy, and hypergammaglobulinemia suggest an immune dysfunction in this disease, it was hypothesized that sarcoidosis is triggered by a group of unknown antigens represented in the host immune cells. To identify the elusive antigen(s), a heterologous cDNA library derived from bronchoalveolar lavage (BAL) cell samples and total white blood cells (WBC) from sarcoidosis patients was developed. Both sarcoid-derived libraries were then combined with cultured human monocyte and embryonic lung fibroblast cDNA libraries to build a complex sarcoidosis library (CSL). Furthermore, antibody recognition and random plaque selection were used during biopanning of the cDNA libraries to minimize the confounding effects of autoantibodies unrelated to sarcoidosis. It was then tested whether this novel library representing relevant antigens could specifically recognize high IgG titers in sera of sarcoidosis subjects. This approach has been successfully applied in biomarker discovery for the diagnosis of lung, head and neck, and breast cancer. Fernandez-Madrid et al., Cancer research 2004; 64(15): 5089-96; Fernandez-Madrid et al., Clinical cancer research: an official journal of the American Association for Cancer Research 1999; 5(6): 1393-400; Lin et al., Cancer epidemiology, biomarkers & prevention: a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology 2007; 16(11): 2396-405. A feature that distinguishes the described methods from previous studies is that the exquisite power of antibody recognition present in the sera of sarcoidosis patients was used to interrogate the potential antigens presented in the macrophages and monocytes.

The present study describes a novel approach to identify sarcoidosis antigens and to detect serum antibodies on high-throughput arrays. Sera from 3 cohorts (sarcoidosis, controls, and TB) were used for immunoscreening. Using bioinformatics tools, a large number of biomarkers with high sensitivity and specificity that can discriminate among the sera of patients with sarcoidosis, healthy controls, and patients with MTB were identified. Using the integrative-analysis method that combines results from two independent trials, clones that significantly differentiate sarcoidosis from controls were identified. Similarly, clones that differentially react with TB sera and not with sarcoidosis or control sera were identified. Furthermore, the top 10 discriminating antigens for TB and sarcoidosis were sequenced and homologies were identified in a public database. These data indicate that a unique library enabling the detection of highly significant antigens to discriminate between patients with sarcoidosis and tuberculosis was developed.

Materials and Methods. Chemicals. All chemicals were purchased from Sigma-Aldrich (St. Louis, Mo.) unless specified otherwise. LeukoLOCK filters and RNAlater were purchased from Life Technologies (Grand Island, N.Y.). The RNeasy Midi kit was obtained from Qiagen (Valencia, Calif.). The T7 mouse monoclonal antibody was purchased from Novagen (San Diego, Calif.). Alexa Fluor 647 goat anti-human IgG and Alexa Fluor 532 goat anti-mouse IgG antibodies were purchased from Life Technologies (Grand Island, N.Y.).

Patient selection. This study was approved by the Institutional Review Board at Wayne State University and the Detroit Medical Center. Patients were recruited at the Center for Sarcoidosis and Interstitial Lung Diseases (SILD), which is a referral center for patients with sarcoidosis and other ILDs. Three sources of patient-derived materials were used in this study: A) a BAL cDNA library derived from BAL cells obtained during diagnostic bronchoscopy from newly diagnosed patients with sarcoidosis (n=20); B) a leukocyte cDNA library developed from sarcoidosis patients who were followed in an outpatient setting with various stages of sarcoidosis (n=36); and C) sera collected from 3 groups: 1) healthy controls, who were volunteers recruited from the community; 2) subjects with biopsy-confirmed sarcoidosis who were followed in an outpatient setting; and 3) subjects with culture-positive TB, whose sera were collected at the Detroit Department of Health and Wellness Promotion. Sarcoidosis subjects were included if they had a diagnosis of sarcoidosis proven by tissue biopsy per guidelines (Costabel & Hunninghake, Eur Respir J 1999; 14(4): 735-7) and a negative PPD. TB subjects were included if they had a positive TB culture and were HIV negative. Subjects were excluded if they were positive for HIV or were receiving high dose immune suppressive medication, defined as prednisone at more than 15 mg alone or in combination with immune modulatory medications. Subjects who had a positive PPD or QuantiFERON test were excluded from the sarcoidosis group. All study subjects signed a written informed consent.

Bronchoalveolar lavage. BAL cells were obtained, after informed consent, during diagnostic bronchoscopy from subjects with active sarcoidosis as previously described. Rastogi et al., American journal of respiratory and critical care medicine 2011; 183(4): 500-10. BAL cells were suspended in 500 μl of RNAlater and stored at −80° C.

Collection of total leukocytes from sarcoid subjects. Leukocytes from 36 sarcoid subjects were isolated from whole blood with LeukoLOCK filters as previously described. Glatt et al., Current pharmacogenomics and personalized medicine 2009; 7(3): 164-88.

Human macrophage (EL-1) and human lung embryonic fibroblast (MRC-5) cell cultures. Both cell lines were obtained from ATCC and cultured as per ATCC recommendations. From each cell line, 1-2 mg RNA was isolated to construct the cDNA library.

Serum collection. Using standardized phlebotomy procedures, blood samples were collected, allowed to clot, and then centrifuged at 2500 rpm for 10 min. Supernatants were stored at −80° C.

Construction of T7 phage display cDNA libraries. Total RNA was isolated using the RNeasy Midi kit (Qiagen, Valencia, Calif.). Integrity of the RNA samples was assessed using the Agilent 2100 bioanalyzer. Total RNA, in the amount of 1-2 mg, was subjected to two cycles of polyA purification to minimize ribosomal RNA contamination as suggested by the manufacturer (Qiagen, Valencia, Calif.). The construction of phage cDNA libraries was performed using Novagen's Orient Express cDNA Synthesis (Random Primer System) and Cloning system as per manufacturer's suggestions (EMD Biosciences-Novagen). Each library was cloned using modified linkers that allow identification of the phage clones. Chatterjee et al., Cancer research 2006; 66(2): 1181-90. The number of clones in each of the 4 libraries was titrated by plaque assay as per manufacturer's instructions (EMD Biosciences-Novagen). Finally, the same number of phages from each of the BAL, WBC, EL-1 and MRC5 libraries was pooled to generate a complex sarcoid library (CSL).

Biopanning of T7 phage displayed cDNA library with human sera. Differential biopanning for negative and positive selection was performed using sera from healthy controls to remove the non-specific IgG, and sarcoidosis sera for selective enrichment, according to manufacturer's suggestions (T7Select System, TB178; EMD Biosciences-Novagen). Protein G Plus-agarose beads (Santa Cruz Biotechnology) were used for serum IgG immobilization. Four rounds of biopanning were performed and the selected phage libraries were used for microarray immunoscreening. Each cycle of biopanning included passing the entire phage library through protein G beads coated with IgG from pooled sera of healthy controls, then passing it through beads coated with IgG from the individual serum of a sarcoid subject.

Microarray construction and immunoscreening. Informative phage clones were randomly picked and amplified after several rounds of biopanning, and their lysates were arrayed in quintuplicate onto nitrocellulose FAST slides (Grace Biolabs, OR) using the ProSys 5510TL robot (Cartesian Technologies, CA). The nitrocellulose slides were then blocked with a solution of 1% BSA in PBS for 1 hour at room temperature, followed by another hour of incubation with serum at a dilution of 1:300 in 1×PBS or plasma at a dilution of 1:100 as primary antibodies, together with mouse anti-T7 capsid antibody (0.15 μg/mL) and BL21 E. coli cell lysates (5 μg/mL). BL21 E. coli cell lysates were added to remove antibodies specific to E. coli from the serum. The microarrays were then washed three times at room temperature with a solution of PBS/0.1% Tween20 for 4 minutes. Secondary antibodies included goat anti-human IgG Alexa Fluor 647 (red fluorescent dye) at 1 μg/mL and goat anti-mouse IgG Alexa Fluor 532 (green fluorescent dye) at 0.05 μg/mL. After 1 hour of incubation in the dark, the microarrays were washed 3 times with a solution of PBS/0.1% Tween20 for 4 minutes at room temperature and 2 times in PBS for 4 minutes at room temperature, and then air dried.

Sequencing of phage cDNA clones. Individual phage clones were PCR amplified using T7 phage forward primer (SEQ ID NO. 75) and reverse primer (SEQ ID NO. 76) and sequenced by Genewiz (South Plainfield, N.J.), using T7 phage sequence primer (SEQ ID NO. 77).

Data acquisition and pre-processing. Following the immunoreaction, the microarrays were scanned in an Axon Laboratories 4100 scanner (Palo Alto, Calif.) using 532 and 647 nm lasers to produce a red (Alexa Fluor 647) and green (Alexa Fluor 532) composite image. Using the ImaGene 6.0 (Biodiscovery) image analysis software, the binding of each sarcoid specific peptide with IgGs in each serum was then analyzed and expressed as a ratio of red-to-green fluorescent intensities. The microarray data were then read into the R environment v2.3.0 (Team RDC. R: a language and environment for statistical computing. R Foundation for Statistical Computing; Vienna (Austria). 2004) and pre-processed by background correction, omission of poor quality spots, and log2 transformation. Within-array loess normalization was performed for each spot, values were summarized by the median of triplicates, and between-array quantile normalization followed.
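The pre-processing steps above can be illustrated with a minimal sketch. It is written in plain Python rather than R, it is not the code used in the study, and it omits background correction, spot filtering, and loess normalization. The function names are hypothetical, and ties in the quantile step are broken by position, which is a simplification.

```python
import math
from statistics import median


def log2_ratio(red, green):
    """Log2 of the red-to-green intensity ratio for one spot."""
    return math.log2(red / green)


def summarize_replicates(values):
    """Collapse the replicate spots of one clone into their median."""
    return median(values)


def quantile_normalize(arrays):
    """Between-array quantile normalization.

    arrays: list of equal-length lists, one list per array (slide).
    Each value is replaced by the mean, across arrays, of the values
    that share its rank. Ties are broken by position (simplification).
    """
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [
        sum(s[i] for s in sorted_arrays) / len(arrays) for i in range(n)
    ]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized
```

After quantile normalization every array has the same distribution of values, so differences between arrays reflect rank changes of individual clones rather than slide-wide intensity shifts.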

Statistical analysis. A microarray analysis was performed using sera from sarcoid subjects and healthy controls in two independent sets of experiments. Technical and biological sources of variation were expected in the design of the experiment. As opposed to pooling all datasets, one powerful and robust method is to integrate results from individual datasets. This approach was expected to yield a higher-confidence list of markers than either dataset alone. To detect differentially expressed antigens between sarcoidosis samples and healthy controls, an integrative analysis of two datasets was performed. Limma's empirical Bayes moderated t-test identified fold-changes in expression of antigens that differed significantly between sarcoidosis and controls for each dataset separately. Then an integrative-analysis method, an adaptively-weighted method with one-sided correction (AW-OC) (Li & Tseng, The Annals of Applied Statistics 2011; 5(2A): 994-1019), was performed to combine the statistics from both datasets. The integrative method was designed to test whether an antigen is consistently up- or down-regulated in sarcoidosis subjects in both datasets. False Discovery Rate (FDR) was estimated using the Benjamini-Hochberg method. Benjamini & Hochberg. J. R. Stat. Soc. Ser. B 1995; 57: 289-300.
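For illustration, the Benjamini-Hochberg adjustment named above can be sketched in a few lines of plain Python. This is a generic version of the published procedure, not the study's R code, and the function name is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values (q values).

    The i-th smallest of m p values is scaled by m / i, and a running
    minimum taken from the largest p value downward keeps the adjusted
    values monotone.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values
```

An antigen would then be called significant when its q value falls below the chosen threshold, for example 0.05.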

To identify a panel of markers that classify sarcoidosis samples and controls, a strategy of univariate marker selection followed by multivariate modeling was used. The top antigens differentially expressed in the two groups were selected using the above described AW-OC approach. The top genes that were consistently up- or down-regulated in both datasets were used. The top markers were then supplied to the supervised classification models to achieve the highest sensitivity and specificity in differentiating sarcoid subjects and controls. The multivariate classification models chosen for this study were K-nearest neighbors (KNN) and support vector machine (SVM). Cross-validation was used to prevent overfitting due to the large number of antigens used to discriminate between sarcoid and control subjects. The study was performed in two nested 10-fold cross-validation loops, an inner loop to select the optimal number of antigens and an outer loop to measure the optimized model performance with estimation of the area under the receiver operating characteristic (AUROC), sensitivity, and specificity. The receiver operating characteristic curves were estimated through 10-fold cross-validation. A moderated t-test was carried out to identify the significant clones between healthy controls, sarcoidosis and tuberculosis.
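The nested cross-validation design can be sketched as follows. This is a simplified plain-Python illustration, not the study's implementation: features are ranked by the absolute difference in class means as a stand-in for the AW-OC ranking, accuracy replaces AUROC as the selection criterion, labels are assumed to be 0 (control) and 1 (sarcoidosis), and all function names are hypothetical.

```python
import random
from collections import Counter


def rank_features(X, y):
    """Order feature indices by absolute difference in class means
    (a simple stand-in for the univariate ranking step)."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, lab in zip(X, y) if lab == 1]
        neg = [row[j] for row, lab in zip(X, y) if lab == 0]
        scores.append(abs(sum(pos) / len(pos) - sum(neg) / len(neg)))
    return sorted(range(len(scores)), key=lambda j: -scores[j])


def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training rows (squared Euclidean)."""
    nearest = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), lab)
        for row, lab in zip(train_X, train_y)
    )[:k]
    return Counter(lab for _, lab in nearest).most_common(1)[0][0]


def folds(n, n_folds, seed):
    """Shuffle indices 0..n-1 and deal them into n_folds groups."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def fit_predict(train_X, train_y, test_X, n_feat, k):
    """Rank features on the training data only, then classify test rows."""
    cols = rank_features(train_X, train_y)[:n_feat]
    reduced = [[row[j] for j in cols] for row in train_X]
    return [
        knn_predict(reduced, train_y, [row[j] for j in cols], k)
        for row in test_X
    ]


def cv_accuracy(X, y, n_feat, n_folds, k, seed):
    """Cross-validated accuracy for a fixed number of features."""
    correct = 0
    for fold in folds(len(X), n_folds, seed):
        held = set(fold)
        tr = [i for i in range(len(X)) if i not in held]
        preds = fit_predict([X[i] for i in tr], [y[i] for i in tr],
                            [X[i] for i in fold], n_feat, k)
        correct += sum(p == y[i] for p, i in zip(preds, fold))
    return correct / len(X)


def nested_cv(X, y, feature_counts, outer=10, inner=10, k=3):
    """Outer loop estimates performance; inner loop picks the feature count."""
    correct = 0
    for fold in folds(len(X), outer, seed=1):
        held = set(fold)
        tr = [i for i in range(len(X)) if i not in held]
        trX, trY = [X[i] for i in tr], [y[i] for i in tr]
        best = max(feature_counts,
                   key=lambda m: cv_accuracy(trX, trY, m, inner, k, seed=2))
        preds = fit_predict(trX, trY, [X[i] for i in fold], best, k)
        correct += sum(p == y[i] for p, i in zip(preds, fold))
    return correct / len(X)
```

Because both the feature ranking and the choice of feature count happen inside the training portion of each outer fold, the held-out samples never influence model selection, which is the point of nesting the loops.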

Results. Generation of cDNA libraries representative of sarcoidosis antigens. Both peripheral blood mononuclear cells (PBMCs) and alveolar macrophages (AMs) play an important role in initiation of sarcoidosis granuloma. It has been shown that extracts from sarcoidosis BAL cells and PBMCs are able to initiate a Kveim-like reaction. Siltzbach & Ehrlich, The American Journal of Medicine 1954; 16(6): 790-803; Holter et al., The American Review of Respiratory Disease 1992; 145(4 Pt 1): 864-71. Therefore, total BAL cells and WBCs from patients with biopsy proven sarcoidosis were used to develop a cDNA antigen library. BAL cells and WBCs were used as sources of antigens in order to increase the diversity of sarcoidosis antigens. To increase the chance of identifying sarcoidosis antigen(s), RNA was isolated from BAL samples obtained from 20 patients with active sarcoidosis to generate the BAL cDNA library. The patients' characteristics are shown in Table 1 (left panel). The LeukoLOCK system was used to isolate RNA from total leukocytes (WBC) obtained from a different cohort of 36 sarcoidosis subjects to build the WBC cDNA library. The patients' characteristics are shown in Table 1 (right panel).

TABLE 1 Subject Demographics, Chest X-Ray Stages, and Organ Involvement

Characteristic            BAL derived RNA    Leukocyte derived RNA
Age (Mean ± SEM)          30 ± 8             36 ± 11.2
BMI (Mean ± SEM)          27.7 ± 8.7         31 ± 5.4
Gender, N (%)
  Male                    7 (33)             12 (33)
  Female                  13 (67)            24 (67)
Race, N (%)
  African American        17 (87)            32 (88)
  White                   3 (13)             4 (12)
CXR stage, N (%)
  1                       2 (6)              1 (3)
  2                       14 (67)            13 (41)
  3                       4 (27)             12 (37)
  4                       0                  6 (19)
Lung                      18                 33
Extrapulmonary            16                 31
Neuro-ophthalmologic      6                  11
Skin                      6                  13
Liver                     2                  4
Heart                     1                  2
Prednisone                1                  3
IMD                       0                  14
Smoking
  None                    12                 26

Age, BMI and disease duration values are presented as means and variability in SD or range where indicated. N = Number of patients and percent shown in parentheses. IMD = Immunomodulatory drugs

Two other sources of cDNA, one from cultured human splenic monocytes (EL-1) and another from lung embryonic fibroblasts (MRC5), were used to generate two additional libraries. These sources were added to increase the chance of discovering potential sarcoidosis antigens. Each cDNA underwent two cycles of polyA selection to minimize ribosomal contamination. These four libraries were developed as described in the Materials and Methods section. Each library was cloned using modified linkers; EcoRI/HindIII was used for BAL cDNA, ALA for WBC cDNA, LEU for MRC5 cDNA and THR for EL-1 cDNA (FIG. 6). The use of these linkers enabled identification of the original library for each antigen.

Differential biopanning of sarcoidosis phage cDNA display libraries. The four phage cDNA display libraries (BAL, WBC, EL-1 and MRC5) were combined to generate a complex sarcoidosis library (CSL). To isolate a large panel of antigens, differential biopanning of the T7 phage cDNA display library was performed on the combined complex sarcoid library. A negative biopanning selection was done using 10 pooled sera from healthy controls to remove non-specific IgG, while 2 sarcoidosis sera were used for positive selective enrichment. One serum was obtained from a woman (P51) with systemic sarcoidosis who had uveitis and another serum was collected from a male subject (P197) who had active systemic sarcoidosis with renal involvement. Both patients had pulmonary involvement. Each clone was derived either from P51 or from P197. The titer of the complex library was assessed (FIG. 7A) and individual phage clones were amplified by PCR (FIG. 7B).

High-throughput protein microarray immunoreaction to select sarcoidosis specific antigens. A total of 1152 potential antigens were randomly selected from the two highly enriched pools of T7 phage cDNA libraries (FIG. 1). These antigens were robotically spotted on nitrocellulose FAST slides and were hybridized with sera of sarcoidosis patients or healthy controls. The binding of each of the arrayed potential sarcoidosis-specific peptides with antibodies in sera was quantified with Alexa Fluor 647 (red-fluorescent dye)-labeled goat anti-human antibody. The amount of phage particles at each spot throughout the microarray was detected using a mouse monoclonal antibody to the T7 capsid protein and quantified using Alexa Fluor 532 (green-fluorescent dye)-labeled goat anti-mouse antibody (FIG. 1). To correct for any small variation in the amount of antibody binding in each spot that may be due to different amounts of phage spotted on the microarray, the ratio of intensity of Alexa Fluor 647 over Alexa Fluor 532 was calculated for each spot. Following immunoreaction, the microarray data were processed by a sequence of transformations and then analyzed. The intra-assay reproducibility was assessed by comparing the results among five replicates printed within the same chip for each clone.

Selection of a panel of antigens and estimation of neural network classifier performance in sarcoidosis. A novel aspect of the described work was the integration of data from two independent trials of printing, allowing the development of two data sets obtained from two independent cohorts of sarcoidosis patients and healthy controls utilized for hybridization. To generate the first dataset, sera from 54 sarcoidosis subjects and 45 healthy controls were immune-screened against 1152 sarcoidosis specific peptides. In a second dataset, sera from 19 healthy controls and 61 sarcoidosis subjects were similarly immune-screened with 1152 potential sarcoidosis specific antigens. Sera used in both data sets for hybridization had not been previously used for biopanning or selection of clones. Table 2 shows the clinical characteristics of sarcoidosis and healthy control subjects.

TABLE 2 Patient characteristics

Characteristic            Sarcoidosis Subjects    Control Subjects
Age                       29.7 ± 13.4 y           33 ± 7.4
BMI                       29 ± 10.4               28 ± 3.6
Gender, N
  Female                  87 (75)                 48 (75)
  Male                    28 (25)                 16 (25)
Race, N
  African American        107 (89)                44 (69)
  White                   8 (11)                  20 (31)
CXR stage, N
  0                       3 (2)                   NA
  1                       18 (15)                 NA
  2                       49 (43)                 NA
  3                       45 (39)                 NA
Organ Involvements
  Neuro-ophthalmologic    33 (28)                 NA
  Lung                    109 (94)                NA
  Skin                    50 (45)                 NA
  Multiorgan              70 (52)                 NA

Some patients had multiple organ involvements. NA = Not Applicable

Within-array loess normalization was performed for each spot, values were summarized by the median of triplicates, and between-array quantile normalization followed. After preprocessing, 1101 antigens common to both datasets were used for further analysis. Univariate and multivariate analyses were performed. Limma's empirical Bayes moderated t-test was used to identify fold-changes in expression of antigens that differed significantly between sarcoidosis and controls for each dataset separately. Then both datasets were combined using an integrative-analysis method, an adaptively-weighted method with one-sided correction (AW-OC). Li & Tseng, The Annals of Applied Statistics 2011; 5(2A): 994-1019. Out of the 1101 potential antigens, 259 showed a strong differentiation between sarcoidosis and healthy control subjects with adjusted p value (q value)<0.05 and FDR (false discovery rate)<0.05. FIG. 2A shows the heatmap of the 259 significant antigens that were differentially expressed in both datasets. Seventy-eight markers out of 259 were consistently up- or down-regulated in sarcoidosis subjects. FIG. 2B shows the AUROC for this classifier. The KNN method performed slightly better than SVM. Using the 32 highly significant antigens selected by the AW-OC and KNN methods to classify sarcoidosis and healthy controls (AW-OC+KNN), the area under the curve (AUROC) was 0.78, with a sensitivity of 89% and a specificity of 83% estimated after 10-fold cross-validation (FIG. 2B).

Characterization of 10 most significant sarcoid antigens. Based on the results of the AW-OC integrative analysis, the top 10 high performance antigens that predict sarcoidosis were identified. To further characterize the performance of each clone, the AUROC and the sensitivity and specificity at the optimal cutoff were calculated for each clone. FIG. 3 depicts the ROC curves for individual sarcoid antigens and their adjusted p value (q value). As shown, each antigen has a different specificity and sensitivity as well as AUROC to predict the presence of sarcoidosis. AUROC values for these antigens ranged from the highest of 0.84 to the lowest of 0.7. Nine of 10 antigens were clearly up-regulated in sarcoidosis versus healthy controls, whereas one was down-regulated. To further characterize the identified antigens, these 10 highest ranked antigens were sequenced. After obtaining the sequences of clones, the Expasy program was used to translate the cDNA sequences to protein sequences. Protein BLAST using the blastn and tblastn algorithms of the BLAST program was applied to identify the highest homology to identified proteins or peptides, and these results were compared with corresponding nucleotide sequences using nucleotide BLAST. The predicted amino acid sequence in frame with phage T7 gene 10 capsid protein was also determined. Five antigens (PC4, SAMDHI, DNAJC1, TPT1 and SH3YL1) among the top 10 fit the definition of an epitope containing known gene products in the reading frame of known genes. The other five contained peptides coded by inserted gene fragments leading to out of frame peptides, which fits the definition of mimotopes. FIG. 8 shows the full length of proteins and genes of the 10 sarcoidosis clones. Without being bound by theory, as sarcoidosis sera reacted to these out of frame peptides, it is likely that these clones represent sarcoidosis antigens produced as a result of altered reading frames or alternative splicing. Interestingly, when a similar technique was applied to discovery of cancer antigens, numerous out of frame peptides were discovered. Lin et al., Cancer epidemiology, biomarkers & prevention: a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology 2007; 16(11): 2396-405. Table 3 shows the 10 most significant sarcoidosis antigens, gene names and q-values.
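As an illustration of the per-clone performance measures, the following plain-Python sketch computes an AUROC and a sensitivity and specificity at a cutoff. The text does not state how the optimal cutoff was chosen; maximizing Youden's J (sensitivity + specificity − 1) is assumed here, and the function names are hypothetical.

```python
def auroc(scores_pos, scores_neg):
    """AUROC as the fraction of (positive, negative) pairs in which the
    positive sample scores higher; ties count as one half."""
    wins = sum(
        (p > n) + 0.5 * (p == n) for p in scores_pos for n in scores_neg
    )
    return wins / (len(scores_pos) * len(scores_neg))


def best_cutoff(scores_pos, scores_neg):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    A sample is called positive when its score is >= cutoff.
    The first cutoff reaching the maximum is kept.
    """
    best = None
    for c in sorted(set(scores_pos) | set(scores_neg)):
        sens = sum(p >= c for p in scores_pos) / len(scores_pos)
        spec = sum(n < c for n in scores_neg) / len(scores_neg)
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1], best[2], best[3]
```

For a down-regulated clone the scores would be negated first, so that higher values still indicate sarcoidosis.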

TABLE 3 The 10 most significant sarcoidosis antigens

Up-Regulated in Sarcoidosis vs Healthy Controls
Clone (library)         Gene Name   q Value       AUC    Sensitivity // Specificity (%, 95% CI)   Protein
P51_BP3_287 (MRC5)      CCL21       1.9 × 10⁻²⁰   0.84   78 // 82   Small inducible cytokine A21 precursor
P51_BP3_281 (BAL)       Metap1      1.0 × 10⁻²⁰   0.78   70 // 82   Methionine aminopeptidase 1
P51_BP4_388 (EL-1)      PC4         0.00045       0.75   70 // 74   Activated RNA polymerase II transcription cofactor variant 4
P51_BP4_596 (WBC)       CLI_3190    0.00045       0.72   72 // 74   RNA methyltransferase
P51_BP4_566 (WBC)       TNFRSF21    0.0009        0.74   70 // 71   Tumor necrosis factor receptor superfamily member 21 precursor. Also known as death receptor 6 (DR6)
P51_BP3_283 (WBC)       CD14        0.0009        0.74   68 // 65   Monocyte differentiation antigen CD14
P51_BP3_47 (EL-1)       DNAJC1      0.002         0.72   60 // 82   DnaJ (Hsp40) homolog subfamily C member 1 precursor
P197_BP4_885 (BAL)      APBB1       0.007         0.79   75 // 82   Amyloid β A4 precursor protein-binding family B member 1-interacting protein
P51_BP4_577 (BAL)       FGFBP-2     0.009         0.70   64 // 68   Fibroblast growth factor binding protein 2 precursor

Down-Regulated in Sarcoidosis vs Healthy Controls
Clone (library)         Gene Name   q Value       AUC    Sensitivity // Specificity (%, 95% CI)   Protein
P197_BP4_755 (BAL)      SH3YL1      1.0 × 10⁻²⁰   0.77   65 // 82   SH3 domain-containing YSC84 like protein 1

Complex sarcoidosis library detects novel antigens in the sera of tuberculosis patients. In view of the clinical and pathological similarities between MTB and sarcoidosis, the most useful clinical antigen(s) should discriminate between these two conditions. To this end, a microarray was constructed using the antigens identified by biopanning the CSL library, and this construct was then interrogated with sera from 17 culture positive MTB subjects. Using a moderated t-test and a q value<0.05 in this system, 238 clones differentially expressed between TB and healthy controls and 380 clones differentially expressed between TB and sarcoidosis were identified. FIG. 4 shows a Venn diagram depicting the overlap between 259 sarcoidosis markers, 238 TB vs. control and 380 TB vs. sarcoidosis markers. Clearly, 47 clones differentiate both sarcoidosis and TB from healthy controls, while 5 of them cannot differentiate sarcoidosis from TB significantly. From these clones, 164 were found to be TB specific, and different from both healthy controls and sarcoidosis clones. FIG. 5 shows the heatmap of 50 significant clones differentially expressed in all three groups. Similarly to the sarcoidosis antigens, the specificity and sensitivity of TB clones were analyzed to predict the presence of TB (Table 4). Finally, 10 TB antigens were sequenced and sequence homologies were searched using the same algorithm as previously described. Table 4 shows the 10 TB-specific antigens as compared to healthy controls as well as sarcoidosis.
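The overlap counts in the Venn diagram correspond to simple set operations on the three lists of significant clones. The sketch below shows one plausible reading of those categories; the function name, dictionary keys, and the exact definition of "TB specific" (significant in both TB comparisons but not in sarcoidosis versus control) are assumptions made for illustration.

```python
def summarize_overlap(sarc_vs_ctrl, tb_vs_ctrl, tb_vs_sarc):
    """Group clone identifiers by which comparisons they are significant in.

    Each argument is a set of clone identifiers with q value < 0.05
    in the named comparison.
    """
    both = sarc_vs_ctrl & tb_vs_ctrl
    return {
        # differ from healthy controls in both diseases
        "both_vs_control": both,
        # differ from controls in both diseases but not between diseases
        "not_tb_vs_sarc": both - tb_vs_sarc,
        # differ in TB from both controls and sarcoidosis only
        "tb_specific": (tb_vs_ctrl & tb_vs_sarc) - sarc_vs_ctrl,
    }
```

With the real clone lists, the sizes of these sets would be compared against the counts reported in FIG. 4.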

TABLE 4 The 10 TB-specific antigens

Up-Regulated in TB vs Sarcoidosis Subjects
Clone (library)         Gene Name   q Value        AUC    Sensitivity // Specificity (%, 95% CI)   Protein
P51_BP3_174 (MRC5)      Fed A       4.9 × 10⁻¹⁵    0.87   88 // 83   Ferredoxin (Mycobacterium tuberculosis)
P51_BP4_610 (BAL)       WDFY3       4.1 × 10⁻¹²    0.92   88 // 84   WDFY3 protein (Homo sapiens)
P51_BP3_266 (EL-1)      MFS         6.7 × 10⁻¹⁰    0.9    82 // 93   Membrane protein (Mycobacterium tuberculosis)
P51_BP3_166 (BAL)       LRPPRC      1.3 × 10⁻⁹     0.81   71 // 90   Leucine rich PPR-motif containing protein (Homo sapiens)
P51_BP4_704 (BAL)       HLA-DR      1.1 × 10⁻⁸     0.89   94 // 83   HLA-DR alpha (Homo sapiens)
P197_BP4_763 (BAL)      TKT         2.7 × 10⁻⁶     0.86   82 // 76   Transketolase (Mycobacterium tuberculosis)
P51_BP4_563 (BAL)       Rv0189C     1.04 × 10⁻⁶    0.85   76 // 86   Dihydroxy acid dehydratase (Mycobacterium tuberculosis)

Down-Regulated in TB vs Sarcoidosis Subjects
Clone (library)         Gene Name   q Value        AUC    Sensitivity // Specificity (%, 95% CI)   Protein
P51_BP3_113 (BAL)       BfrA        1.2 × 10⁻¹⁹    0.9    88 // 85   Chain A Mycobacterium tuberculosis
P51_BP3_200 (BAL)       DAB2        1.5 × 10⁻⁹     0.92   82 // 91   Disabled homolog 2 isoform 2 (Homo sapiens)
P51_BP4_622 (BAL)       TCEB2       6.9 × 10⁻⁷     0.89   82 // 89   Transcription elongation factor B polypeptide 2 isoform (Homo sapiens)

After sequence analysis and homology search, one identical sequence between a TB clone and a sarcoidosis clone was identified. Although the clone names differed (P51_BP3_287 versus P51_BP3_174) and the clones performed differently in sarcoidosis versus TB as indicated by their q values (compare Table 3 and Table 4), the sequence was the same; searching it against different NCBI BLAST databases (Mycobacterium toxoid and the universal blast) identified two different proteins. FIG. 9 shows the full length of protein and genes of 10 TB antigens. Surprisingly, TB clones show much higher sensitivity and specificity; similarly, the AUROC was larger for the majority of TB antigens (Table 4).

Discussion. The described work was inspired by the classic observation that the intradermal injection of a suspension of granulomatous splenic tissue (Kveim-Siltzbach test) induces granuloma formation weeks later in patients with sarcoidosis, suggesting the presence of antigen(s) in granuloma tissue and host immunoreactivity to those antigen(s). Kveim-like effects have also been observed using non-viable BAL cell extracts or PBMCs derived from sarcoidosis subjects. Several studies have attempted to identify specific antigens that can discriminate sarcoidosis from normal subjects or from patients with other granulomatous diseases such as TB (Hajizadeh et al., J. Clin. Immunol. 2007; 27(4): 445-54; Chen & Moller, Proc. Am. Thorac. Soc. 2007; 4(1): 101-7), but most of these studies used limited proteomics or genomics to search for tissue antigens. Hajizadeh et al., J. Clin. Immunol. 2007; 27(4): 445-54; Richter et al., Am. J. Resp. Crit. Care 1999; 159(6): 1981-4; Song et al., The Journal of Experimental Medicine 2005; 201(5): 755-67. Here, using novel high throughput technology, the current gap was overcome by constructing phage-protein microarrays in which peptides derived from a unique sarcoidosis cDNA library were expressed as a sarcoidosis phage fusion protein. The phage-protein microarrays were screened to identify phage-peptide clones that bind antibodies in serum samples from patients with sarcoidosis but not in those from controls. Importantly, the same microarray constructs were immune-screened using sera of culture positive TB patients.

The length of identified peptides for sarcoidosis antigens ranged from 9 to 130 amino acids (AA), while the peptide length for TB antigens ranged from 9 to 209 AA. Among 10 sarcoidosis specific phage peptides, 5 expressed sequence tags with in frame epitopes were identified. Five other reactive antigens were relatively short out of frame peptides meeting the criteria to be considered as mimotopes (mimetic sequence of a true epitope). Similarly, among 10 sequenced TB specific phage peptides, 5 in frame epitopes with full length in frame proteins with homology to known human sequences were identified. Five other sequences were relatively short peptides with homology to various known MTB proteins (Table 4).

Interestingly, TB antigens had much higher specificity and sensitivity as compared to antigens selective to sarcoidosis, as indicated by higher AUCs (Table 4). Although the significance of mimotopes is not clear, it has been shown that some out of frame peptides are immunogenic and can activate MHC class I molecules. Due to smaller peptide sequences of mimotopes, they may have homology with diverse proteins. Prior studies using similar techniques in various cancers had similarly identified out of frame peptides. Lin et al., Cancer epidemiology, biomarkers & prevention: a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology 2007; 16(11): 2396-405; Wang et al., Autoantibody signatures in prostate cancer. N. Engl. J. Med. 2005; 353(12): 1224-35; Chatterjee et al., Cancer Research 2006; 66(2): 1181-90. Detection of mimotopes in the described methods may be due to out of frame peptide synthesis secondary to altered ribosomal function, or may correspond to open reading frames, or generation of displayed peptides due to competition for binding during phage selection during phage insertion.

Although the primary goal was to identify the immune signature in sarcoidosis, a panel of antigens differentially expressed in sarcoidosis and tuberculosis as compared to healthy subjects was also identified. Tables 3 and 4 summarize the 10 most significant clones identified in sarcoidosis and tuberculosis, respectively.

In recent years several groups have attempted to identify specific signatures to distinguish between tuberculosis and sarcoidosis using transcriptomics or gene expression profiling. Koth et al., Am. J. Resp. Crit. Care 2011; 184(10): 1153-63; Maertzdorf et al., Proc. Natl. Acad. Sci. USA 2012; 109(20): 7853-8; Berry et al., Nature 2010; 466(7309): 973-7. Yet most of these methods led to the discovery of a series of markers or expression signatures that failed to discriminate between these two diseases. Koth et al., American journal of respiratory and critical care medicine 2011; 184(10): 1153-63; Stone et al., PLoS One 2013; 8(1): e54487. This is partly because several inflammatory or infectious diseases such as CD, lupus, sarcoidosis and tuberculosis may respond to various antigens with activation of similar transcriptomes and/or inflammatory gene expression profiles. For instance, Maertzdorf et al. found more similarity in the activated pathways than differences between sarcoidosis and MTB. Proc. Natl. Acad. Sci. USA 2012; 109(20): 7853-8. Their results in sarcoidosis were similar to those results by Berry indicating the importance of the interferon pathway (IFN) signature in MTB. Maertzdorf et al., Proc. Natl. Acad. Sci. USA 2012; 109(20): 7853-8; Berry et al., Nature 2010; 466(7309): 973-7. In addition, considerable pathway overlap was identified between lupus, sarcoidosis and TB. Maertzdorf et al., Proc. Natl. Acad. Sci. USA 2012; 109(20): 7853-8. However, despite similar genetic or transcriptomic signatures, these diseases are clinically entirely different and require different therapy. Tuberculosis, a global infectious disease caused by the intracellular bacterium Mycobacterium tuberculosis, remains a worldwide health problem (http://www.who.int). One barrier to eradication of tuberculosis, besides the lack of effective vaccination, is the lack of a reliable antigen to evaluate the activity of the disease and its response to treatment. Nahid et al., Am. J. Resp. Crit. Care 2011; 184(8): 972-9. Standard methods to diagnose TB and to monitor response to treatment rely on sputum microscopy and culture. The current CDC/NIH roadmap emphasizes the need for development of new TB antigens as alternative methods. Nahid et al., Am. J. Resp. Crit. Care 2011; 184(8): 972-9. In view of this background, perhaps surprisingly, the described microarray platform could discriminate tuberculosis from sarcoidosis and healthy controls. In addition to antigens for sarcoidosis, more than 300 clones specific for tuberculosis were detected. Interestingly, a considerable number of these clones were TB specific and related to bacterial growth of Mycobacterium tuberculosis and its metabolism (Table 4). Recently a tremendous effort has been put toward elucidating the antibody response to MTB antigens, which has implications for the development of new antigens to diagnose and monitor successful treatment, as well as to develop effective vaccination. Kunnath-Velayudhan et al., Proc. Natl. Acad. Sci. USA 2010; 107(33): 14703-8. Yet, a consistent immune response to MTB has not been found. Most other studies searching for antigens in TB have identified unspecific markers primarily involving host response, such as C-reactive protein or serum amyloid A and others, but not MTB specific antigens. Agranoff et al., Lancet 2006; 368(9540): 1012-21; De Groote et al., PLoS One 2013; 8(4): e61002. MTB has the ability to survive within host macrophages, largely escaping immune surveillance and maintaining its ability for replication and person to person transmission. Meena & Rajni, The FEBS journal 2010; 277(11): 2416-27.

The primary goal of the described project was to discover antigens related to sarcoidosis. Yet specific antigens for TB were also detected. These results are surprising, as the question remains: how can the sarcoidosis library detect TB specific antigens? Lungs are environmentally highly exposed to numerous bacteria, and the described library is predominantly derived from BAL cells that contain all types of immune cells, including macrophages that might have integrated messages from MTB. Without being bound by theory, this could be the reason why the CSL was able to detect TB specific antigens. Still, the major question is why BAL cells of patients with sarcoidosis can harbor MTB messages, yet respond to PPD skin testing with anergy, as all donors with sarcoidosis were PPD negative.

Similar to gene-expression profiling and the pattern-recognition approaches utilizing serum proteomics, the described methods may have the limitations of background signals and sample-selection bias. To minimize these problems, an integrative-analysis method, namely an adaptively-weighted statistical method, was applied to two sets of data acquired in two independent experiments. The discriminatory power of antibody signatures was validated by analyzing data from two completely different cohorts of patients.
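By way of illustration only, and not of limitation, the following sketch shows one way evidence from two independent experiments could be combined with adaptive weights. The disclosure does not specify the statistic that was used; the sketch assumes an adaptively weighted Fisher-type combination of per-experiment p-values. The function names `chi2_sf_even` and `aw_fisher` and the example p-values are hypothetical and do not come from the described experiments.

```python
import math
from itertools import product


def chi2_sf_even(x, df):
    """Upper-tail probability of a chi-square variable with even df (closed form)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def aw_fisher(pvalues):
    """Search all 0/1 weightings of the experiments for the best Fisher score.

    Returns (score, weights): the smallest Fisher-combined tail probability
    and the 0/1 weight per experiment that produced it. Illustrative only.
    """
    if any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    best_score, best_weights = None, None
    for weights in product((0, 1), repeat=len(pvalues)):
        if not any(weights):
            continue  # at least one experiment must contribute
        # Fisher statistic over the experiments selected by this weighting
        stat = -2.0 * sum(math.log(p) for p, w in zip(pvalues, weights) if w)
        score = chi2_sf_even(stat, 2 * sum(weights))
        if best_score is None or score < best_score:
            best_score, best_weights = score, weights
    return best_score, best_weights


if __name__ == "__main__":
    # Hypothetical p-values for one clone in two independent experiments
    print(aw_fisher([0.01, 0.02]))
    print(aw_fisher([0.01, 0.90]))
```

In this sketch the selected weights indicate which experiment(s) support a given clone. The returned score is a minimum over weightings and is therefore not itself a calibrated p-value; a permutation or comparable procedure would be needed to assess significance.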

In summary, a novel T7 phage display library was developed from macrophages from BAL and monocytes from blood leukocytes of patients with sarcoidosis. The library may display a significant segment of the universe of potential sarcoidosis and MTB antigens that can be specifically recognized by high IgG antibodies in sarcoidosis and MTB sera. The described results support the hypothesis that sarcoidosis sera can recognize antigens presented in sarcoidosis materials. The current study of the antibody response can advance the use of proteomics to harness immunity to identify and treat diseases, because it investigates antibody-antigen interactions and also evaluates the effects of pathogen and host characteristics on antibody responses.

Standard reference works setting forth the general principles of immunology include Abbas et al., Cellular and Molecular Immunology (6th Ed.), W.B. Saunders Co., Philadelphia, 2007; Janeway et al., Immunobiology. The Immune System in Health and Disease, 6th ed., Garland Publishing Co., New York, 2005; Delves et al. (eds.) Roitt's Essential Immunology (11th ed.) Wiley-Blackwell, 2006; Roitt et al., Immunology (7th ed.) C.V. Mosby Co., St. Louis, Mo. (2006); Klein et al., Immunology (2nd ed), Blackwell Scientific Publications, Inc., Cambridge, Mass., (1997).

Additionally, methods particularly useful for polyclonal and monoclonal antibody production, isolation, characterization, and use are described in the following standard references: Harlow et al., Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1988; Harlow et al., Using Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1998; Monoclonal Antibodies and Hybridomas: A New Dimension in Biological Analyses, Plenum Press, New York, N.Y. (1980); Zola et al., in Monoclonal Hybridoma Antibodies: Techniques and Applications, CRC Press, 1982.

As will be understood by one of ordinary skill in the art, each embodiment disclosed herein can comprise, consist essentially of or consist of its particular stated element, step, ingredient or component. Thus, the terms “include” or “including” should be interpreted to recite: “comprise, consist of, or consist essentially of.” As used herein, the transition term “comprise” or “comprises” means includes, but is not limited to, and allows for the inclusion of unspecified elements, steps, ingredients, or components, even in major amounts. The transitional phrase “consisting of” excludes any element, step, ingredient or component not specified. The transition phrase “consisting essentially of” limits the scope of the embodiment to the specified elements, steps, ingredients or components and to those that do not materially affect the embodiment. As used herein, a material effect would cause a statistically-significant reduction in the ability to diagnose a sarcoidosis subject from a healthy subject or a sarcoidosis subject from a tuberculosis subject.

Unless otherwise indicated, all numbers expressing quantities of ingredients, properties such as molecular weight, reaction conditions, and so forth used in the specification and claims are to be understood as being modified in all instances by the term “about.” Accordingly, unless indicated to the contrary, the numerical parameters set forth in the specification and attached claims are approximations that may vary depending upon the desired properties sought to be obtained by the present invention. At the very least, and not as an attempt to limit the application of the doctrine of equivalents to the scope of the claims, each numerical parameter should at least be construed in light of the number of reported significant digits and by applying ordinary rounding techniques. When further clarity is required, the term “about” has the meaning reasonably ascribed to it by a person skilled in the art when used in conjunction with a stated numerical value or range, i.e. denoting somewhat more or somewhat less than the stated value or range, to within a range of ±20% of the stated value; ±19% of the stated value; ±18% of the stated value; ±17% of the stated value; ±16% of the stated value; ±15% of the stated value; ±14% of the stated value; ±13% of the stated value; ±12% of the stated value; ±11% of the stated value; ±10% of the stated value; ±9% of the stated value; ±8% of the stated value; ±7% of the stated value; ±6% of the stated value; ±5% of the stated value; ±4% of the stated value; ±3% of the stated value; ±2% of the stated value; or ±1% of the stated value.

Notwithstanding that the numerical ranges and parameters setting forth the broad scope of the invention are approximations, the numerical values set forth in the specific examples are reported as precisely as possible. Any numerical value, however, inherently contains certain errors necessarily resulting from the standard deviation found in their respective testing measurements.

The terms “a,” “an,” “the” and similar referents used in the context of describing the invention (especially in the context of the following claims) are to be construed to cover both the singular and the plural, unless otherwise indicated herein or clearly contradicted by context. Recitation of ranges of values herein is merely intended to serve as a shorthand method of referring individually to each separate value falling within the range. Unless otherwise indicated herein, each individual value is incorporated into the specification as if it were individually recited herein. All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”) provided herein is intended merely to better illuminate the invention and does not pose a limitation on the scope of the invention otherwise claimed. No language in the specification should be construed as indicating any non-claimed element essential to the practice of the invention.

Groupings of alternative elements or embodiments of the invention disclosed herein are not to be construed as limitations. Each group member may be referred to and claimed individually or in any combination with other members of the group or other elements found herein. It is anticipated that one or more members of a group may be included in, or deleted from, a group for reasons of convenience and/or patentability. When any such inclusion or deletion occurs, the specification is deemed to contain the group as modified, thus fulfilling the written description of all Markush groups used in the appended claims.

Certain embodiments of this invention are described herein, including the best mode known to the inventors for carrying out the invention. Of course, variations on these described embodiments will become apparent to those of ordinary skill in the art upon reading the foregoing description. The inventors expect skilled artisans to employ such variations as appropriate, and the inventors intend for the invention to be practiced otherwise than specifically described herein. Accordingly, this invention includes all modifications and equivalents of the subject matter recited in the claims appended hereto as permitted by applicable law. Moreover, any combination of the above-described elements in all possible variations thereof is encompassed by the invention unless otherwise indicated herein or otherwise clearly contradicted by context.

Furthermore, numerous references have been made to patents, printed publications, journal articles and other written text throughout this specification (referenced materials herein). Each of the referenced materials is individually incorporated herein by reference in its entirety for its referenced teaching.

In closing, it is to be understood that the embodiments of the invention disclosed herein are illustrative of the principles of the present invention. Other modifications that may be employed are within the scope of the invention. Thus, by way of example, but not of limitation, alternative configurations of the present invention may be utilized in accordance with the teachings herein. Accordingly, the present invention is not limited to that precisely as shown and described.

The particulars shown herein are by way of example and for purposes of illustrative discussion of the preferred embodiments of the present invention only and are presented in the cause of providing what is believed to be the most useful and readily understood description of the principles and conceptual aspects of various embodiments of the invention. In this regard, no attempt is made to show structural details of the invention in more detail than is necessary for the fundamental understanding of the invention, the description taken with the drawings and/or examples making apparent to those skilled in the art how the several forms of the invention may be embodied in practice.

Definitions and explanations used in the present disclosure are meant and intended to be controlling in any future construction unless clearly and unambiguously modified in the following examples or when application of the meaning renders any construction meaningless or essentially meaningless. In cases where the construction of the term would render it meaningless or essentially meaningless, the definition should be taken from Webster's Dictionary, 3rd Edition or a dictionary known to those of ordinary skill in the art, such as the Oxford Dictionary of Biochemistry and Molecular Biology (Ed. Anthony Smith, Oxford University Press, Oxford, 2004).

What is claimed is:
1. A kit comprising: a protein that binds Small inducible cytokine A21 precursor (CCL21); Methionine aminopeptidase 1 (Metap1); Activated RNA polymerase II transcription cofactor variant 4 (PC4); RNA methyltransferase (CLI_3190); Tumor necrosis factor receptor superfamily member 21 precursor (TNFRSF21); Monocyte differentiation antigen CD14 (CD14); DnaJ (Hsp40) homolog subfamily C member 1 precursor (DNAJC1); Amyloid β A4 precursor protein binding family B member 1-interacting protein (APBB1); Fibroblast growth factor binding protein 2 precursor (FGFBP-2); or SH3 domain-containing YSC84 like protein 1 (SH3YL1) and a detectable label; a nucleic acid that binds a gene encoding CCL21; Metap1; PC4; CLI_3190; TNFRSF21; DNAJC1; APBB1; FGFBP-2; or SH3YL1 and a detectable label; a protein that binds Ferredoxin (Fed A); WDFY3 protein (WDFY3); Membrane protein (MFS); Leucine rich PPR-motif containing protein (LRPPRC); HLA-DR alpha (HLA-DR); Transketolase (TKT); Dihydroxy acid dehydratase (Rv0189C); Chain A Mycobacterium tuberculosis (BfrA); Disabled homolog 2 isoform 2 (DAB2); or Transcription elongation factor B polypeptide 2 isoform (Homo sapiens) (TCEB2) and a detectable label; or a nucleic acid that binds a gene encoding Fed A; WDFY3; MFS; LRPPRC; HLA-DR; TKT; Rv0189C; BfrA; DAB2; or TCEB2 and a detectable label.
2. The kit according to claim 1, wherein the kit comprises more than one protein, and the proteins comprise antibodies, epitopes or mimotopes.
3. The kit according to claim 1, wherein the detectable label comprises a radioactive isotope, enzyme, dye, fluorescent dye, magnetic bead, or biotin.
4. The kit according to claim 1, wherein the kit comprises reagents to perform an enzyme-linked immunosorbent assay (ELISA), a radioimmunoassay (RIA), a Western blot, an immunoprecipitation, an immunohistochemical staining, flow cytometry, fluorescence-activated cell sorting (FACS), an enzyme substrate color method, and/or an antigen-antibody agglutination.
5. A microarray comprising: a protein that binds cytokine A21 precursor (CCL21); Methionine aminopeptidase 1 (Metap1); Activated RNA polymerase II transcription cofactor variant 4 (PC4); RNA methyltransferase (CLI_3190); Tumor necrosis factor receptor superfamily member 21 precursor (TNFRSF21); Monocyte differentiation antigen CD14 (CD14); DnaJ (Hsp40) homolog subfamily C member 1 precursor (DNAJC1); Amyloid β A4 precursor protein binding family B member 1-interacting protein (APBB1); Fibroblast growth factor binding protein 2 precursor (FGFBP-2); or SH3 domain-containing YSC84 like protein 1 (SH3YL1); a protein that binds Ferredoxin (Fed A); WDFY3 protein (WDFY3); Membrane protein (MFS); Leucine rich PPR-motif containing protein (LRPPRC); HLA-DR alpha (HLA-DR); Transketolase (TKT); Dihydroxy acid dehydratase (Rv0189C); Chain A Mycobacterium tuberculosis (BfrA); Disabled homolog 2 isoform 2 (DAB2); or Transcription elongation factor B polypeptide 2 isoform (Homo sapiens) (TCEB2); a nucleic acid that binds to a gene encoding CCL21; Metap1; PC4; CLI_3190; TNFRSF21; CD14; DNAJC1; APBB1; FGFBP-2; or SH3YL1; a nucleic acid that binds a gene encoding Fed A; WDFY3; MFS; LRPPRC; HLA-DR; TKT; Rv0189C; BfrA; DAB2; or TCEB2; one or more of CCL21; Metap1; PC4; CLI_3190; TNFRSF21; CD14; DNAJC1; APBB1; FGFBP-2; or SH3YL1; or one or more of Fed A; WDFY3; MFS; LRPPRC; HLA-DR; TKT; Rv0189C; BfrA; DAB2; or TCEB2.
6. A method, comprising obtaining a sample from the subject; assaying the sample for: one or more markers selected from cytokine A21 precursor (CCL21); Methionine aminopeptidase 1 (Metap1); Activated RNA polymerase II transcription cofactor variant 4 (PC4); RNA methyltransferase (CLI_3190); Tumor necrosis factor receptor superfamily member 21 precursor (TNFRSF21); Monocyte differentiation antigen CD14 (CD14); DnaJ (Hsp40) homolog subfamily C member 1 precursor (DNAJC1); Amyloid β A4 precursor protein binding family B member 1-interacting protein (APBB1); Fibroblast growth factor binding protein 2 precursor (FGFBP-2); or SH3 domain-containing YSC84 like protein 1 (SH3YL1); or one or more markers selected from WDFY3 protein (WDFY3); Membrane protein (MFS); Leucine rich PPR-motif containing protein (LRPPRC); HLA-DR alpha (HLA-DR); Chain A Mycobacterium tuberculosis (BfrA); Disabled homolog 2 isoform 2 (DAB2); or Transcription elongation factor B polypeptide 2 isoform (Homo sapiens) (TCEB2).
7. The method of claim 6, wherein: the sample is a tissue sample, a cell sample, a whole blood sample, a serum sample, a plasma sample, a saliva sample, a sputum sample, or a urine sample; and/or assaying the sample for one or more markers comprises contacting the sample with a probe comprising a detectable label and that binds the one or more markers; and/or obtaining a value based on the assay comprises quantitating the amount of the marker in the sample; and/or the value is a score or a weighted score.
8. The method according to claim 6, wherein assaying the sample for one or more markers comprises contacting the sample with a probe comprising a detectable label and that binds the one or more markers.
9. The method according to claim 6, wherein obtaining a value based on the assay comprises quantitating the amount of the marker in the sample.
10. The method according to claim 9, wherein the value is a score.
11. The method according to claim 10, wherein the score is a weighted score.
12. The method of claim 6, wherein the sample is assayed for two or more markers selected from CCL21, Metap1, PC4, CLI_3190, TNFRSF21, CD14, DNAJC1, APBB1, FGFBP-2, or SH3YL1.
13. The method of claim 6, further comprising: obtaining a value based on the assay; comparing the value to a reference level; diagnosing the subject as having sarcoidosis based on the up- or downregulation of the one or more markers as demonstrated by the value and the reference level; and treating the subject diagnosed with having sarcoidosis with a sarcoidosis treatment comprising one or more of lifestyle and behavioral interventions, corticosteroids, methotrexate or azathioprine, hydroxychloroquine or chloroquine, cyclophosphamide or chlorambucil, pentoxifylline and thalidomide, infliximab or adalimumab, colchicine, various nonsteroidal anti-inflammatory drugs (NSAIDs), and/or organ transplantation.
14. The method of claim 6, wherein the sample is assayed for one or more markers selected from WDFY3, MFS, LRPPRC, HLA-DR, BfrA, DAB2, or TCEB2.
15. The method of claim 6, further comprising: obtaining a value based on the assay; comparing the value to a reference level; and diagnosing the subject as having tuberculosis based on the upregulation of one or more of WDFY3, MFS, LRPPRC, and/or HLA-DR, or the downregulation of the one or more of BfrA, DAB2, and/or TCEB2, as demonstrated by the value and the reference level; and treating the subject diagnosed as having tuberculosis with a tuberculosis treatment comprising one or more of isoniazid (INH), rifampin (RIF), ethambutol (EMB), or pyrazinamide (PZA).
16. A method of diagnosing a subject as having sarcoidosis rather than tuberculosis and treating sarcoidosis in the subject, the method comprising: obtaining a sample derived from the subject; assaying the sample for HLA-DR alpha (HLA-DR) and Membrane protein (MFS); obtaining a value based on the assay; comparing the value to a reference level; diagnosing the subject as having sarcoidosis rather than tuberculosis based on the up- or down-regulation of the one or more markers as demonstrated by the value and the reference level; and treating the subject diagnosed as having sarcoidosis with a sarcoidosis treatment comprising one or more of a corticosteroid, methotrexate, azathioprine, hydroxychloroquine, chloroquine, cyclophosphamide, chlorambucil, pentoxifylline and thalidomide, infliximab, adalimumab, colchicine, a nonsteroidal anti-inflammatory drug (NSAID), and/or organ transplantation.
17. The method of claim 16, comprising further assaying the sample for one or more markers selected from WDFY3 protein (WDFY3); Leucine rich PPR-motif containing protein (LRPPRC); Chain A Mycobacterium tuberculosis (BfrA); Disabled homolog 2 isoform 2 (DAB2); Transcription elongation factor B polypeptide 2 isoform (Homo sapiens) (TCEB2); Small inducible cytokine A21 precursor (CCL21); Methionine aminopeptidase 1 (Metap1); Activated RNA polymerase II transcription cofactor variant 4 (PC4); RNA methyltransferase (CLI_3190); Tumor necrosis factor receptor superfamily member 21 precursor (TNFRSF21); Monocyte differentiation antigen CD14 (CD14); DnaJ (Hsp40) homolog subfamily C member 1 precursor (DNAJC1); Amyloid β A4 precursor protein-binding family B member 1-interacting protein (APBB1); Fibroblast growth factor binding protein 2 precursor (FGFBP-2); or SH3 domain-containing YSC84 like protein 1 (SH3YL1).
18. The method of claim 17, comprising diagnosing the subject as having tuberculosis based on the upregulation of one or more of WDFY3, MFS, LRPPRC, and/or HLA-DR, or the downregulation of the one or more of BfrA, DAB2, and/or TCEB2, as demonstrated by the value and the reference level.